The four things that make up an agent's context
First, the prompts: the system prompt that sets the agent's role and rules, plus the user input that started the run. These are the part you author directly. Second, memory: the running history of the conversation or task — earlier turns, prior decisions, summaries of what's happened. This grows as the run proceeds, and it's where things quietly fall out when the window fills up. Third, tool outputs: every result a tool returned is context for the next decision, and a malformed or misread result poisons everything downstream.
Fourth, state: the agent's sense of where it is in the task — what it's trying to do right now, what it's already done, what's left. Some agents track this explicitly; in others it's implicit in the conversation. All four travel together into each model call, and the model treats them as one undifferentiated view of the world. It can't tell a stale memory from a fresh fact, or a wrong tool result from a right one — it just reasons over whatever is there. That's why the contents of context, not just the prompt, decide what the agent does.
Why capturing context is the heart of debugging
When an agent makes a baffling decision, the first question is always: what did it actually see? Not what you think it saw — what was genuinely in its context at that moment. The answer explains the behavior almost every time. The model picked the wrong tool because an earlier result framed the problem differently. It repeated itself because the memory of having already done the task dropped out of the window. It hallucinated a value because the real one was never in context to begin with.
You can't answer 'what did it see?' from the agent's source code, because context is assembled at runtime from prompts, memory, and tool results that change with every run. You have to capture it. That means recording, per step, the full context that went into the model and the reply that came out. With that in hand, a confusing agent becomes legible: you read the context, you see why the decision made sense to the model, and you fix the context rather than blaming the model. agentis records this context alongside each step of the run so the 'what did it see?' question always has a concrete answer.
frequently asked
What's the difference between a prompt and an agent's context?
The prompt is what you author — the system instructions and the user input. Context is everything the model can see at decision time, which includes the prompt but also the running memory of earlier turns, the results of tools it called, and its current state. The model acts on the whole context, not just the prompt, which is why debugging means looking at all of it.
Why do so many agent bugs come down to context?
Because the model only reasons over what's in its context window. If a tool result was wrong, a memory went stale, or a crucial earlier turn dropped out, the model makes a sensible decision based on bad information. The reasoning isn't broken — the inputs were. Most baffling agent behavior resolves the moment you see the actual context the model had.
How does context fall out of the window?
Context windows are finite, so as a run accumulates turns and tool results, older content has to be dropped or summarized to fit. If something load-bearing — an instruction, an earlier result, the original goal — gets trimmed, the model loses it and starts acting as if it never existed. Capturing per-step context lets you see exactly when something important disappeared.
How do I capture an agent's context for debugging?
Record, for each model call, the full context that went in (prompt, memory, tool results, state) and the reply that came out, tagged to the run. Then you can read what the model actually saw at the failing step. agentis captures this per step from the logs you post, so the context behind each decision is always available rather than reconstructed from guesswork.
Last updated June 19, 2026