
Grammar-constrained decoding, explained: why the output is valid by construction

the short answer

Grammar-constrained decoding applies your language's grammar as a mask over the model's token choices at each decoding step. Only tokens that keep the output inside the grammar can be sampled, which makes the result valid by construction rather than merely likely. dsl.ai compiles a grammar you paste into exactly this constraint, with no training set or GPU.

Grammar-constrained decoding is a way of making a language model produce output that always belongs to a specific language — JSON, a programming language, or your own DSL. Instead of asking the model nicely and checking afterwards, it changes what the model is even allowed to emit, step by step, using the language's grammar as the rulebook.

The payoff is a guarantee rather than a probability: output isn't merely likely to be valid, it's valid by construction. This explainer covers what it is, how it works, and how it differs from the alternatives, then points to where dsl.ai fits.

by construction: validity guaranteed, not made more likely

What it is

A language model generates text one token at a time, each drawn from a probability distribution over its whole vocabulary. Grammar-constrained decoding inserts a step between that distribution and the sampling: it consults your grammar, works out which tokens could legally come next given what's been generated so far, and masks out every token that would break the language. The model then samples only from what's left.
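To make the masking step concrete, here is a minimal sketch in plain Python. The five-token vocabulary, the scores, and the set of allowed tokens are all made up for illustration; a real runtime works over the model's full vocabulary and derives the allowed set from the grammar.

```python
import math

def mask_logits(logits, allowed):
    """Push every disallowed token's score to -inf; leave allowed scores untouched."""
    return {tok: (score if tok in allowed else -math.inf)
            for tok, score in logits.items()}

def softmax(logits):
    """Turn scores into probabilities. Assumes at least one token is allowed."""
    peak = max(logits.values())
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

# Hypothetical raw scores from a model over a tiny vocabulary.
logits = {"{": 1.2, "}": 0.3, '"key"': 2.0, ":": 0.1, "hello": 3.5}

# Hypothetical grammar state: just after "{", only a key or "}" may follow.
allowed = {'"key"', "}"}

probs = softmax(mask_logits(logits, allowed))

for tok, p in probs.items():
    print(f"{tok!r}: {p:.3f}")
```

Before masking, "hello" is the model's favourite token. After masking its probability is exactly zero, so it cannot be sampled no matter how the sampler is configured.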

Do that at every step and the running output can never leave the grammar, because there's never a moment where an illegal token is available to pick. When generation runs to completion (it stops where the grammar allows the string to end, rather than being cut off by a length limit), the result is guaranteed to be a valid string in your language. This is the mechanism behind GBNF grammars in llama.cpp and libraries like Outlines and XGrammar.
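The whole loop can be sketched with a toy grammar written as a state machine. Everything here is an assumption for illustration: the grammar, the vocabulary, and a random scorer standing in for a real model. It is not how llama.cpp, Outlines, or XGrammar are implemented internally.

```python
import random
import re

# Hypothetical toy grammar:  list ::= "[" item ("," item)* "]"   item ::= "a" | "b"
# Written as a state machine: state -> {legal token: next state}.
GRAMMAR = {
    "start":      {"[": "need_item"},
    "need_item":  {"a": "after_item", "b": "after_item"},
    "after_item": {",": "need_item", "]": "done"},
}
VOCAB = ["[", "]", ",", "a", "b", "hello", "{"]

def fake_model_scores(rng):
    """Stand-in for a real model: arbitrary positive weights over the whole vocabulary."""
    return {tok: rng.random() + 0.01 for tok in VOCAB}

def generate(rng, max_steps=50):
    """Return (text, finished). finished is False if the step limit cut generation short."""
    state, out = "start", []
    for _ in range(max_steps):
        if state == "done":
            break
        scores = fake_model_scores(rng)
        legal = GRAMMAR[state]
        # Mask: only tokens the grammar allows in this state remain candidates.
        candidates = [tok for tok in VOCAB if tok in legal]
        weights = [scores[tok] for tok in candidates]
        tok = rng.choices(candidates, weights=weights, k=1)[0]
        out.append(tok)
        state = legal[tok]
    return "".join(out), state == "done"

text, finished = generate(random.Random(0))
print(text, finished)
```

The scorer knows nothing about the grammar and happily assigns weight to "hello" and "{", yet every finished output matches the grammar, because those tokens are never candidates.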

Why the output is valid by construction

'Valid by construction' means the validity isn't checked after the fact — it's a property of how the output was built. At no point during generation could the model have produced something invalid, so there's no failure mode to catch. Compare that with prompting ('please output valid syntax') or fine-tuning, both of which only shift probabilities and leave a nonzero chance of a token that breaks the language.
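A rough piece of arithmetic shows why "only shifts probabilities" matters over long outputs. The sketch below assumes, purely for illustration, that an unconstrained model has a fixed and independent chance of emitting a grammar-breaking token at each step. Real models don't behave this neatly, and the numbers are invented.

```python
def chance_all_steps_legal(p_illegal, n_tokens):
    """Probability that none of n_tokens steps picks an illegal token,
    assuming an independent per-step chance p_illegal (an illustrative simplification)."""
    return (1.0 - p_illegal) ** n_tokens

# Hypothetical: a well-prompted model that slips once per thousand tokens.
unconstrained = chance_all_steps_legal(0.001, 500)

# With the grammar mask, an illegal token is never available, so p_illegal is 0.
constrained = chance_all_steps_legal(0.0, 500)

print(f"unconstrained: {unconstrained:.3f}")
print(f"constrained:   {constrained:.3f}")
```

Under these assumptions a 500-token output comes out fully legal only about 61% of the time without the mask, even though each individual step is 99.9% reliable. With the mask the per-step failure chance is zero, so it has nothing to compound.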

The constraint only ever removes illegal options; it never picks for the model. Among the tokens the grammar allows, the model chooses freely, so the content is still entirely the model's — it's just always well-formed. Syntax is guaranteed; meaning is still the model's job, which is why for hard semantic cases you might add retrieval or, optionally, fine-tune on top.
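One way to see that the mask removes options without choosing for the model: among the tokens that survive, the model's relative preferences are unchanged. This sketch uses invented tokens and scores and assumes a plain softmax over the scores.

```python
import math

def softmax(scores):
    """Turn scores into probabilities."""
    peak = max(scores.values())
    exps = {tok: math.exp(s - peak) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

# Hypothetical scores; "banana" is not legal in this (made-up) query language.
logits = {"SELECT": 2.0, "DELETE": 1.0, "banana": 3.0}
allowed = {"SELECT", "DELETE"}

before = softmax(logits)
after = softmax({tok: s for tok, s in logits.items() if tok in allowed})

# How strongly the model prefers SELECT over DELETE, with and without the mask.
ratio_before = before["SELECT"] / before["DELETE"]
ratio_after = after["SELECT"] / after["DELETE"]

print(f"before: {ratio_before:.4f}  after: {ratio_after:.4f}")
```

The two ratios match: removing "banana" rescales the remaining probabilities so they sum to one, but the model's preference for SELECT over DELETE is exactly what it was.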

How dsl.ai uses it

dsl.ai is the part that turns your grammar into that constraint. You paste your DSL's EBNF/GBNF-style grammar into the browser playground — no account, no GPU, no training set — and it compiles the grammar into the decoding mask and, from the same grammar, a deterministic validator. The exact grammar you test drops into a hosted open model in production, so what you prove in the playground is what runs.
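The idea of deriving both a decoding mask and a validator from one grammar can be illustrated with a toy example. This is not dsl.ai's API or implementation; the grammar table and the function names `walk`, `allowed_next`, and `is_valid` are hypothetical, chosen only to show that a single grammar definition can drive both.

```python
# Hypothetical toy grammar:  stmt ::= "set" ("x" | "y") "=" ("0" | "1") ";"
# Written as a state machine: state -> {legal token: next state}.
GRAMMAR = {
    "start": {"set": "name"},
    "name":  {"x": "eq", "y": "eq"},
    "eq":    {"=": "value"},
    "value": {"0": "semi", "1": "semi"},
    "semi":  {";": "done"},
}
ACCEPT = "done"

def walk(tokens):
    """Follow the grammar over tokens; return the state reached, or None on an illegal token."""
    state = "start"
    for tok in tokens:
        nxt = GRAMMAR.get(state, {}).get(tok)
        if nxt is None:
            return None
        state = nxt
    return state

def allowed_next(tokens):
    """The decoding mask: which tokens may legally follow this prefix."""
    state = walk(tokens)
    return set() if state is None else set(GRAMMAR.get(state, {}))

def is_valid(tokens):
    """The validator: is this a complete string in the language?"""
    return walk(tokens) == ACCEPT

print(allowed_next(["set", "x"]))
print(is_valid(["set", "x", "=", "1", ";"]))
print(is_valid(["set", "=", "1"]))
```

Because the mask and the validator read the same table, they cannot disagree about what the language is. That is the property that lets a grammar tested in one place behave identically in another.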

Three ways to get valid output from an LLM

Approach                     | What it does                         | Validity
Prompting                    | Asks the model to follow the rules   | Likely, never guaranteed
Fine-tuning                  | Shifts probabilities toward examples | More likely, never certain
Grammar-constrained decoding | Masks illegal tokens each step       | Valid by construction

frequently asked

Does grammar-constrained decoding hurt output quality?
No. It only removes tokens that would break the grammar; the model still chooses freely among every valid option, so quality of meaning is unaffected while syntax becomes guaranteed.
Does it guarantee the output is correct, or just valid?
Just syntactically valid — it guarantees the output belongs to your language, not that it does the right thing. Semantics are still the model's responsibility; add validation rules or fine-tuning for hard semantic cases.
What do I need to use it?
A grammar for your language and an open model served through a runtime that accepts a GBNF-style grammar. dsl.ai compiles the grammar you paste for you, with no GPU or training data.
How is it different from JSON mode?
JSON mode is grammar-constrained decoding fixed to the JSON grammar. The general technique lets you supply any grammar, so the same guarantee applies to your own DSL or query language.

Last updated June 8, 2026
