How to trace an AI agent's execution path

the short answer

You trace an AI agent's execution path by assigning each run a single ID and emitting one structured log entry per step: the prompt and reply for each model call, and the arguments and result for each tool call. Tag every entry with that run ID, send them to one place, then read them back in order. That reconstructs the exact sequence of decisions the agent made from start to finish.

An execution path is the ordered record of everything an agent did during one run: each model call, each decision, each tool invocation, and each result, in the sequence they actually happened. It's the agent equivalent of a stack trace, except instead of function frames it shows reasoning steps. Once you can see the path, an opaque agent becomes a readable story — you can follow what it was thinking and where it went wrong.

agentis.ogbuilds.ai/agents/support-triage/run/7e9f

support-triage
run-id: 7e9f-4b2c-a1d8
7 steps · 1 error

execution path
#01  action     start run · ticket #4825
#02  llm·call   classify intent → billing dispute
#03  tool·call  crm.getCustomer(88213)
#04  llm·call   decide: load customer invoices
#05  tool·call  billing.getInvoices: ETIMEDOUT
#06  decision   retry? no (3 attempts) → fallback
#07  llm·call   draft holding reply + escalate

where this happens in the app

The execution path assembled from 7 tagged steps: one run-id groups them, sequence numbers order them, and type labels (action, llm·call, tool·call, decision) tell you what each one was.

  1. the run-id at the top: one id per run, stamped on every step so scattered log entries reassemble into a single ordered path
  2. step types in sequence: action, llm·call, tool·call, decision. Every kind of step the agent took is labeled and numbered so the path is readable top to bottom
  3. step 5 in red: the tool·call that failed; reading from the top, this is the first step where the run diverged from the intended path

What belongs in a trace

A trace is only as useful as the detail on each step. For every model call, capture the prompt that went in and the full reply that came back — including any tool call the model requested and the arguments it chose. For every tool call, capture the tool's name, its inputs, and exactly what it returned (success payload or error). Stamp each entry with the run ID so they group together, and a sequence number or timestamp so they order correctly. That's the spine of a readable trace.
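One way to picture the spine described above is a single record type serialized as one JSON line per step. This is a minimal sketch, not a schema agentis prescribes: the `Step` class, its field names, the `to_log_line` helper, the ASCII type label `tool.call`, and the `customer_id` argument are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, asdict
from typing import Any


@dataclass
class Step:
    """One entry in a trace (field names are illustrative)."""
    run_id: str   # groups every step of one run
    seq: int      # orders steps within the run
    type: str     # e.g. "llm.call" or "tool.call"
    name: str     # model name or tool name
    input: Any    # the prompt, or the tool's arguments
    output: Any   # the full reply, or the tool's result or error


def to_log_line(step: Step) -> str:
    """Serialize one step as a single JSON line."""
    return json.dumps(asdict(step), sort_keys=True)


# The failing tool call from the example run, expressed as a record.
# The argument shape is assumed; the source only shows the tool name and error.
tool_step = Step(
    run_id="7e9f-4b2c-a1d8",
    seq=5,
    type="tool.call",
    name="billing.getInvoices",
    input={"customer_id": 88213},
    output={"error": "ETIMEDOUT"},
)
line = to_log_line(tool_step)
```

Keeping input and output on the same record means a reader sees what went in and what came back without joining two log lines.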

Optional extras sharpen it: the agent's current goal or state at each step, token counts if you're watching cost, and latency per call if you're chasing slowness. But don't over-capture. The failure mode of tracing is drowning the signal — a trace with everything in it is as hard to read as no trace at all. Capture the decision points and what flowed through them; skip the noise.
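Latency is the easiest of these extras to capture, because it only needs a timer around the call. The sketch below assumes a hypothetical helper, `call_with_extras`, that wraps any callable; `slow_lookup` is a stand-in for a real tool. Token counts would have to come from whatever usage data your model provider returns, so they are left out here.

```python
import time
from typing import Any, Callable


def call_with_extras(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> dict:
    """Run fn and return its result (or error) plus the call's latency."""
    started = time.perf_counter()
    try:
        outcome = {"result": fn(*args, **kwargs)}
    except Exception as exc:
        # Record the failure instead of raising, so the trace still gets an entry.
        outcome = {"error": f"{type(exc).__name__}: {exc}"}
    outcome["latency_ms"] = round((time.perf_counter() - started) * 1000, 2)
    return outcome


def slow_lookup(customer_id: int) -> dict:
    time.sleep(0.01)  # stand-in for a real network call
    return {"id": customer_id}


extras = call_with_extras(slow_lookup, 88213)
```

The returned dictionary can be merged into the step's log entry; nothing else about the step needs to change.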

Reading the trace once you have it

A good trace is read in order, not searched. Start at the top and follow the agent's reasoning forward: it saw this context, decided to call that tool, got this result, decided again. You're looking for the first step where the path stops matching what you intended: the model misread a result, called the wrong tool, or looped on something that kept failing. That first divergence is usually the bug, and everything after it is fallout.
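When the divergence is an outright error, finding it can be automated. The sketch below replays the seven-step example run as plain dictionaries and returns the earliest step that recorded an error. The dictionary keys and the `first_failure` function are assumptions for illustration. It only catches explicit errors; a model that misread a valid result still needs a human reading the path.

```python
from typing import Optional

# The example run, one dictionary per step (key names are illustrative).
path = [
    {"seq": 1, "type": "action", "summary": "start run, ticket #4825", "error": None},
    {"seq": 2, "type": "llm.call", "summary": "classify intent: billing dispute", "error": None},
    {"seq": 3, "type": "tool.call", "summary": "crm.getCustomer(88213)", "error": None},
    {"seq": 4, "type": "llm.call", "summary": "decide: load customer invoices", "error": None},
    {"seq": 5, "type": "tool.call", "summary": "billing.getInvoices", "error": "ETIMEDOUT"},
    {"seq": 6, "type": "decision", "summary": "no retry, fallback", "error": None},
    {"seq": 7, "type": "llm.call", "summary": "draft holding reply + escalate", "error": None},
]


def first_failure(steps: list[dict]) -> Optional[dict]:
    """Return the earliest step, by sequence number, that recorded an error."""
    for step in sorted(steps, key=lambda s: s["seq"]):
        if step.get("error"):
            return step
    return None


suspect = first_failure(path)
```

Here `suspect` is step 5, the timed-out tool call; steps 6 and 7 are the fallout that follows from it.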

Loops deserve special attention. If you see the same tool called repeatedly with the same failing result, the agent is stuck, and the trace shows you exactly the cycle it's caught in. Reading the path as one continuous sequence — rather than hunting through scattered log lines — is what makes that pattern jump out. agentis renders the path this way by default and can summarize a long run or point you at the suspicious step when there are too many to skim.
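The stuck-agent pattern can also be checked mechanically. This is a rough sketch under assumed field names (`type`, `name`, `args`, `error`): it looks for the same tool called with the same arguments and failing with the same error several times in a row, ignoring the model calls in between. The `find_stuck_call` function and the threshold of three repeats are illustrative choices, not something agentis defines.

```python
import json
from typing import Optional


def find_stuck_call(steps: list[dict], min_repeats: int = 3) -> Optional[dict]:
    """Find a tool call that failed identically min_repeats times in a row."""
    run_length = 0
    previous_key = None
    for step in sorted(steps, key=lambda s: s["seq"]):
        if step["type"] != "tool.call":
            continue  # model calls between retries do not break the cycle
        key = (step["name"], json.dumps(step["args"], sort_keys=True), step.get("error"))
        if step.get("error") and key == previous_key:
            run_length += 1
        else:
            # A new failing call starts a streak of one; a success resets it.
            run_length = 1 if step.get("error") else 0
        previous_key = key
        if run_length >= min_repeats:
            return {"name": step["name"], "error": step["error"],
                    "repeats": run_length, "last_seq": step["seq"]}
    return None


# A made-up trace in which the agent keeps retrying the same failing call.
looping_trace = [
    {"seq": 1, "type": "llm.call"},
    {"seq": 2, "type": "tool.call", "name": "billing.getInvoices",
     "args": {"customer_id": 88213}, "error": "ETIMEDOUT"},
    {"seq": 3, "type": "llm.call"},
    {"seq": 4, "type": "tool.call", "name": "billing.getInvoices",
     "args": {"customer_id": 88213}, "error": "ETIMEDOUT"},
    {"seq": 5, "type": "llm.call"},
    {"seq": 6, "type": "tool.call", "name": "billing.getInvoices",
     "args": {"customer_id": 88213}, "error": "ETIMEDOUT"},
]
stuck = find_stuck_call(looping_trace)
```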

how it works

  1. Give each run one ID

    At the start of an agent run, generate a unique run identifier. Pass it through every step so each model call and tool call can be tagged with it. This is what lets you group scattered steps back into one path.

  2. Log every model call

    For each call to the model, record the prompt sent and the full reply received, including any tool call the model requested and its arguments. Tag it with the run ID and a sequence number.

  3. Log every tool call

    For each tool the agent invokes, record the tool name, the arguments it was given, and the result it returned (including errors). Tag these with the same run ID so they interleave correctly with the model calls.

  4. Send the steps somewhere readable

    Post the tagged steps to a destination that assembles them in order — a structured log store you query, or an endpoint like agentis that builds the run into a snapshot. The goal is one ordered view per run.

  5. Read the path in sequence

    Walk the assembled trace from the top, following the agent's reasoning forward. Stop at the first step that diverges from what you intended — that's where to debug, and the rest is usually downstream of it.
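The five steps above can be sketched end to end in a few dozen lines. Everything named here is an assumption for illustration: the `Tracer` class, the `fake_model` and `get_customer` stand-ins, and the in-memory list used as the destination. A real agent would swap in its own model client, its own tools, and a real log store or endpoint as the sink.

```python
import uuid
from typing import Any, Callable


class Tracer:
    """Tags every step of one run with a shared run ID and a sequence number."""

    def __init__(self, sink: Callable[[dict], None]) -> None:
        self.run_id = str(uuid.uuid4())  # step 1: one ID per run
        self._seq = 0
        self._sink = sink                # step 4: where tagged steps are sent

    def _emit(self, step_type: str, name: str, sent: Any, received: Any) -> None:
        self._seq += 1
        self._sink({"run_id": self.run_id, "seq": self._seq, "type": step_type,
                    "name": name, "input": sent, "output": received})

    def model_call(self, model: Callable[[str], dict], prompt: str) -> dict:
        reply = model(prompt)
        self._emit("llm.call", "model", prompt, reply)  # step 2: prompt and full reply
        return reply

    def tool_call(self, name: str, tool: Callable[..., Any], args: dict) -> Any:
        try:
            result = tool(**args)
        except Exception as exc:
            self._emit("tool.call", name, args, {"error": str(exc)})  # step 3: errors too
            raise
        self._emit("tool.call", name, args, {"result": result})
        return result


def fake_model(prompt: str) -> dict:
    # Stand-in for a real model client; always requests the same tool.
    return {"text": "", "tool_call": {"name": "crm.getCustomer",
                                      "args": {"customer_id": 88213}}}


def get_customer(customer_id: int) -> dict:
    # Stand-in for a real CRM lookup.
    return {"id": customer_id, "plan": "pro"}


collected: list[dict] = []
tracer = Tracer(sink=collected.append)
reply = tracer.model_call(fake_model, "Classify ticket #4825")
requested = reply["tool_call"]
tracer.tool_call(requested["name"], get_customer, requested["args"])

# Step 5: read the path in sequence.
for step in sorted(collected, key=lambda s: s["seq"]):
    print(step["seq"], step["type"], step["name"])
```

Routing every call through one object keeps the run ID and the counter in a single place, so no step can be logged without them.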

frequently asked

What exactly is an execution path?

It's the ordered list of everything an agent did in one run — each model call with its prompt and reply, and each tool call with its arguments and result — in the sequence they happened. Think of it as a stack trace for reasoning: instead of function frames, it shows the steps and decisions the agent actually took.

How do I tie scattered log lines back into one path?

Generate a single run ID at the start of each agent run and attach it to every step you log. When you read the logs back, you group by that run ID and order by a sequence number or timestamp. The shared ID is what turns disconnected lines into one coherent trace.
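As a small illustration of that grouping, the sketch below takes interleaved JSON log lines from two runs and rebuilds one ordered path per run. The line format, the shortened run IDs, and the `assemble_paths` function are assumptions, not a format any particular log store requires.

```python
import json
from collections import defaultdict

# Log lines from two runs, arriving interleaved and out of order.
raw_lines = [
    '{"run_id": "7e9f", "seq": 2, "type": "llm.call"}',
    '{"run_id": "a111", "seq": 1, "type": "action"}',
    '{"run_id": "7e9f", "seq": 1, "type": "action"}',
    '{"run_id": "7e9f", "seq": 3, "type": "tool.call"}',
    '{"run_id": "a111", "seq": 2, "type": "llm.call"}',
]


def assemble_paths(lines: list[str]) -> dict[str, list[dict]]:
    """Group steps by run ID, then order each group by sequence number."""
    paths: dict[str, list[dict]] = defaultdict(list)
    for line in lines:
        step = json.loads(line)
        paths[step["run_id"]].append(step)
    for steps in paths.values():
        steps.sort(key=lambda s: s["seq"])
    return dict(paths)


paths = assemble_paths(raw_lines)
```

Sorting on a per-run sequence number gives a stable order even when two steps would share the same timestamp.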

How much should I capture per step?

Enough to debug, no more. The essentials are the prompt and model reply for each model call, and the name, arguments, and result for each tool call. The agent's goal or state, token counts, and per-call latency are useful extras. Avoid dumping everything: an over-stuffed trace is as unreadable as no trace, so capture the decision points and what flowed through them.

Can I trace an agent built in any language or framework?

Yes, if you capture the universal pieces — prompts, model replies, tool calls, and results — and tag them with a run ID. None of that is framework-specific. agentis accepts these steps over a generic HTTP endpoint, so an agent in any language that can make an HTTP request can have its path traced and rendered.
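To show how little is needed on the sending side, here is a sketch that posts one step as JSON using only Python's standard library. The URL is a placeholder, and the payload shape and the absence of authentication are assumptions; the actual agentis endpoint, required fields, and credentials are not described here, so check its documentation before relying on any of this.

```python
import json
import urllib.request

# Placeholder URL: the real endpoint is not given in this article.
TRACE_ENDPOINT = "https://example.invalid/steps"


def build_step_request(step: dict, endpoint: str = TRACE_ENDPOINT) -> urllib.request.Request:
    """Build a JSON POST request carrying one tagged step."""
    body = json.dumps(step).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_step(step: dict, endpoint: str = TRACE_ENDPOINT, timeout: float = 5.0) -> int:
    """Send one step and return the HTTP status code."""
    request = build_step_request(step, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Building the request does not touch the network; send_step would.
request = build_step_request({"run_id": "7e9f-4b2c-a1d8", "seq": 1, "type": "action"})
```

The same three ingredients (a JSON body, a content-type header, a POST) are available in nearly every language's HTTP client.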

Last updated June 19, 2026
