Debugging Production Systems

Four commands cover 95% of production debugging. Here's how they work together as a complete workflow — from noticing a failure to proving the fix.

1. flux tail

Stream live requests — spot failures instantly

flux tail streams every request hitting your Flux functions in real time. Successes appear in green. Failures appear in red with an inline error summary.

The request ID on failures is your debugging handle. Copy it — you'll pass it to the next command.

Unlike log tailing, flux tail shows you the full execution context: method, path, status, latency, and error summary — all in one line.

Useful flags

--fn payments   filter to one function
--errors-only   hide 2xx requests
--since 30m     replay recent requests

$ flux tail

  Streaming live requests…

    POST /signup    201  88ms   req:4f9a3b2c
    GET  /users     200  12ms   req:a3c91ef0
    POST /checkout  200  192ms  req:b7e3d12f
    POST /signup    500  44ms   req:550e8400
     └─ Error: Stripe API timeout
    GET  /products  200  9ms    req:cc4a8e71
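
If you want to hand failing request IDs to the next command without copying them by hand, you can parse the stream. This is a minimal TypeScript sketch. It assumes each request line follows the layout in the sample session above (method, path, status, latency, then a req: ID), which is taken from that sample and is not a documented format. parseTailLine and failedRequestIds are hypothetical helpers, not part of Flux.

```typescript
// Hypothetical helpers. The line layout is assumed from the sample session,
// not from a documented Flux output contract.
interface TailEntry {
  method: string;
  path: string;
  status: number;
  latencyMs: number;
  requestId: string;
}

const TAIL_LINE = /^\s*([A-Z]+)\s+(\S+)\s+(\d{3})\s+(\d+)ms\s+req:([0-9a-f]+)\s*$/;

// Returns null for anything that is not a request line
// (the banner, blank lines, or the indented error detail line).
function parseTailLine(line: string): TailEntry | null {
  const m = TAIL_LINE.exec(line);
  if (m === null) return null;
  return {
    method: m[1],
    path: m[2],
    status: Number(m[3]),
    latencyMs: Number(m[4]),
    requestId: m[5],
  };
}

// Collects the request IDs of every 4xx or 5xx request.
function failedRequestIds(lines: string[]): string[] {
  const ids: string[] = [];
  for (const line of lines) {
    const entry = parseTailLine(line);
    if (entry !== null && entry.status >= 400) {
      ids.push(entry.requestId);
    }
  }
  return ids;
}
```

Run against the sample session, the only ID collected is the one on the failing POST /signup line.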

2. flux why

Root-cause the failure — one command

flux why takes the request ID from flux tail and gives you everything you need to understand the failure:

  • Root cause — the first error in the span tree
  • Location — exact file and line in your code
  • Data changes — every row that was mutated (including rolled-back writes)
  • Suggestion — AI-generated fix hint based on the error pattern

This is enough to fix most failures. If you need more detail, continue to step 3.
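
To make "the first error in the span tree" concrete, here is one possible reading as a TypeScript sketch: search children before their parent, so a failure that bubbled up is attributed to the innermost span that raised it. The Span shape and the search order are assumptions for illustration. This guide does not describe how Flux actually stores traces or picks the root cause.

```typescript
// Hypothetical span shape, for illustration only.
interface Span {
  name: string;
  error?: string;
  children: Span[];
}

// Post-order search: children are checked (in order) before the parent,
// so the innermost, earliest failing span wins over a parent that
// merely propagated the error.
function firstError(span: Span): Span | null {
  for (const child of span.children) {
    const found = firstError(child);
    if (found !== null) return found;
  }
  return span.error !== undefined ? span : null;
}
```

With a gateway span that failed only because a nested stripe.create span timed out, this returns the stripe.create span rather than the gateway.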

$ flux why 550e8400

  ROOT CAUSE
  Stripe API timeout after 10s

  LOCATION
  payments/create.ts : line 42

  DATA CHANGES
  users id=42
    plan : free → null   (rolled back)

  SUGGESTION
   Add 5s timeout + idempotency key retry
   Consider moving to async background step
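
The suggestion in the sample session (a 5s timeout plus a retry that reuses an idempotency key) could look roughly like this in TypeScript. This is a sketch under assumptions: Charge is a hypothetical stand-in for your payment client call, not a real Stripe SDK signature, and it assumes the provider deduplicates requests that carry the same key.

```typescript
// Rejects if `work` has not settled within `ms` milliseconds.
// Note: this stops waiting, it does not cancel the underlying request.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timeout after ${ms}ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Hypothetical payment call: returns a charge ID.
type Charge = (amountCents: number, idempotencyKey: string) => Promise<string>;

async function chargeWithRetry(
  charge: Charge,
  amountCents: number,
  idempotencyKey: string,
  timeoutMs = 5000,
  maxAttempts = 3,
): Promise<string> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // The same key is sent on every attempt, so a retry after a timeout
      // cannot create a second charge if the provider honours the key.
      return await withTimeout(charge(amountCents, idempotencyKey), timeoutMs);
    } catch (err) {
      // Simplification: every error is retried. Real code would retry
      // only transient failures such as timeouts.
      lastError = err;
    }
  }
  throw lastError;
}
```

Reusing one key across attempts is the important part: a timed-out request may still have succeeded on the provider's side, and the key is what makes the retry safe.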

3. flux trace debug

Step through spans interactively

flux trace debug puts you in an interactive, span-by-span walkthrough of the execution. Think of it as a step debugger for a single request's recorded execution.

Each span shows you:

  • The exact input that was passed to that span
  • The exact output (or error) it returned
  • The duration and any DB state changes it caused

Use this when flux why identifies the failure but you want to understand exactly what the function received and returned at each step.

$ flux trace debug 550e8400

  Span 1/5  gateway             2ms
  in  : POST /signup  {email, plan}
  out : → runtime  request_id=550e8400

  Span 2/5  create_user         4ms
  in  : {email: "alice@..", plan: "pro"}
  out : users.id = 42  (inserted)

  Span 3/5  stripe.create           timeout
  in  : {customer_id: "cus_123", amount: 2900}
  out : Error: Request timeout after 10011ms

  [n]ext  [p]rev  [q]uit
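
The [n]ext / [p]rev navigation in the session above amounts to a cursor over an ordered list of spans. Here is a small TypeScript sketch of that idea. SpanStep and SpanCursor are hypothetical names, and clamping at both ends is an assumption about how the walkthrough behaves, not documented behaviour.

```typescript
// Hypothetical shape for one step of the walkthrough.
interface SpanStep {
  name: string;
  input: string;
  output: string;
}

class SpanCursor {
  private index = 0;
  private readonly spans: readonly SpanStep[];

  constructor(spans: readonly SpanStep[]) {
    if (spans.length === 0) throw new Error("trace has no spans");
    this.spans = spans;
  }

  current(): SpanStep {
    return this.spans[this.index];
  }

  // Header in the style of the sample session, e.g. "Span 2/5  create_user".
  label(): string {
    return `Span ${this.index + 1}/${this.spans.length}  ${this.current().name}`;
  }

  // Moves forward one span, staying on the last span at the end.
  next(): SpanStep {
    this.index = Math.min(this.index + 1, this.spans.length - 1);
    return this.current();
  }

  // Moves back one span, staying on the first span at the start.
  prev(): SpanStep {
    this.index = Math.max(this.index - 1, 0);
    return this.current();
  }
}
```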

4. flux trace diff

Verify your fix — diff two traces

After you fix the bug and deploy, trigger the same scenario and capture the new request ID. Then flux trace diff compares the two executions side by side:

  • Spans that were added (prefixed with + in the output)
  • Spans that changed or were removed
  • Mutation changes (what the DB looks like now)

This is how you prove your fix worked — not just "it seems fine" but "here is exactly how the execution changed".

$ flux trace diff 550e8400 7b1d3f9a

  Comparing traces…

  Span diff
  stripe.create
  - duration  : 10011ms  (timeout)
  + duration  : 142ms    (success)
  + idempotency_key: ik_1234  (new)

  Mutation diff
  users.id=42
  - plan: null  (rolled back)
  + plan: pro   (committed)

   Fix verified: idempotency key + 5s timeout
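
Conceptually, a span diff like the one above compares two traces span by span. Here is a minimal TypeScript sketch of that comparison. It assumes spans can be matched by name and carry flat string attributes. SpanRecord, SpanDiff and diffTraces are hypothetical, and Flux's real matching rules are not described in this guide.

```typescript
// Hypothetical span shape: a name plus flat string attributes.
interface SpanRecord {
  name: string;
  attrs: Record<string, string>;
}

interface SpanDiff {
  added: string[];
  removed: string[];
  changed: { span: string; key: string; before: string | undefined; after: string | undefined }[];
}

// Matches spans by name (assumed unique within a trace) and reports
// added spans, removed spans, and attribute-level changes.
function diffTraces(before: SpanRecord[], after: SpanRecord[]): SpanDiff {
  const diff: SpanDiff = { added: [], removed: [], changed: [] };
  const afterByName = new Map<string, SpanRecord>();
  for (const s of after) afterByName.set(s.name, s);

  const seen = new Set<string>();
  for (const b of before) {
    seen.add(b.name);
    const a = afterByName.get(b.name);
    if (a === undefined) {
      diff.removed.push(b.name);
      continue;
    }
    // Union of keys, so attributes that exist on only one side are reported too.
    const keys = new Set<string>([...Object.keys(b.attrs), ...Object.keys(a.attrs)]);
    for (const key of keys) {
      if (b.attrs[key] !== a.attrs[key]) {
        diff.changed.push({ span: b.name, key, before: b.attrs[key], after: a.attrs[key] });
      }
    }
  }
  for (const a of after) {
    if (!seen.has(a.name)) diff.added.push(a.name);
  }
  return diff;
}
```

An attribute that exists only in the new trace, such as the idempotency key in the sample, shows up as a change whose before value is undefined.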
Going Deeper

For harder incidents, go further.

flux state blame --id

Which request changed this row?

When you find corrupted data and don't know which request caused it, flux state blame gives you the full history of any database row.
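
One way to picture a row's history is as a filter over a log of mutations, each tagged with the request that made it. The TypeScript sketch below assumes such a log exists. The RowMutation shape, rowHistory and lastWriter are hypothetical and say nothing about how Flux stores this data internally.

```typescript
// Hypothetical mutation record, for illustration only.
interface RowMutation {
  requestId: string;
  table: string;
  rowId: number;
  column: string;
  before: string | null;
  after: string | null;
  at: number; // timestamp in milliseconds
}

// All mutations for one row, oldest first. filter() returns a new array,
// so sorting it does not reorder the caller's log.
function rowHistory(log: RowMutation[], table: string, rowId: number): RowMutation[] {
  return log
    .filter((m) => m.table === table && m.rowId === rowId)
    .sort((x, y) => x.at - y.at);
}

// The request that most recently wrote a given column, or null if none did.
function lastWriter(log: RowMutation[], table: string, rowId: number, column: string): string | null {
  const history = rowHistory(log, table, rowId).filter((m) => m.column === column);
  return history.length > 0 ? history[history.length - 1].requestId : null;
}
```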

flux incident replay --window -

Replay the incident safely

Re-runs a production time window against your fixed code with all side effects disabled. Emails won't send. Payments won't process.
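
A common way to make a replay safe is to route every side effect through an interface and swap in a recording implementation during the replay. The TypeScript sketch below shows that pattern. Effects, RecordingEffects and handleSignup are hypothetical, and whether Flux disables side effects this way is an assumption.

```typescript
// Hypothetical boundary for everything that touches the outside world.
interface Effects {
  sendEmail(to: string, subject: string): void;
  chargeCard(customerId: string, amountCents: number): void;
}

// Replay implementation: records what would have happened and does nothing else.
class RecordingEffects implements Effects {
  readonly skipped: string[] = [];

  sendEmail(to: string, subject: string): void {
    this.skipped.push(`email to ${to}: ${subject}`);
  }

  chargeCard(customerId: string, amountCents: number): void {
    this.skipped.push(`charge ${customerId}: ${amountCents}`);
  }
}

// Example handler: it only reaches the outside world through `effects`,
// so the same code runs unchanged in production and in a replay.
function handleSignup(email: string, customerId: string, effects: Effects): string {
  effects.chargeCard(customerId, 2900);
  effects.sendEmail(email, "Welcome");
  return `signed up ${email}`;
}
```

The handler's logic still runs end to end, so you can check that the fixed code no longer fails, while the skipped list shows which effects it would have triggered.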

flux bug bisect

Find the commit that broke it

Binary-searches your git history to find the first commit where a request started failing. Like git bisect but for production behaviour.
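
The search itself is an ordinary binary search for the first failing commit. Here is a TypeScript sketch, assuming commits are ordered oldest to newest and that once a commit fails, every later commit fails too. firstBadCommit and the isBad callback (which would replay the request against a given commit) are hypothetical.

```typescript
// Returns the first commit for which isBad is true, or null if none is.
// Assumes `commits` is ordered oldest to newest and that failures are
// monotonic: once a commit is bad, all later commits are bad.
function firstBadCommit(commits: string[], isBad: (commit: string) => boolean): string | null {
  let lo = 0;
  let hi = commits.length; // search the half-open range [lo, hi)
  while (lo < hi) {
    const mid = lo + Math.floor((hi - lo) / 2);
    if (isBad(commits[mid])) {
      hi = mid; // first bad commit is at mid or earlier
    } else {
      lo = mid + 1; // first bad commit is after mid
    }
  }
  return lo < commits.length ? commits[lo] : null;
}
```

Each check halves the remaining range, so a history of 1,000 commits needs about 10 replays rather than 1,000.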

flux explain

Ask AI for analysis

Sends a trace to an LLM with full context — spans, mutations, errors — and returns a detailed diagnosis and suggested fix. Dry-run safe by default.

Want the hands-on version?

The quickstart tutorial walks through all four steps with a real function that deliberately fails.