Common Tasks

Copy-paste workflows for the things you actually do in production.

Debug a production incident

Scenario: a user reports that checkout failed. You have no idea why.

Step 1 — Stream live requests and find the failed request ID:

$ flux tail

  POST /checkout    201  120ms  req:4f9a3b2c
  POST /checkout    500   44ms  req:550e8400
     └─ Error: Stripe API timeout
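
If the stream is busy, you can filter for failures instead of scanning by eye. The sketch below assumes the line layout shown above (method, path, status, duration, `req:` ID). `failed_request_ids` is a hypothetical helper, not part of Flux.

```python
import re

# Matches lines shaped like: "POST /checkout    500   44ms  req:550e8400"
LINE = re.compile(
    r"^\s*(?P<method>[A-Z]+)\s+(?P<path>\S+)\s+(?P<status>\d{3})"
    r"\s+(?P<ms>\d+)ms\s+req:(?P<req>\w+)"
)

def failed_request_ids(lines):
    """Yield (request_id, path, status) for every 5xx request line."""
    for line in lines:
        m = LINE.match(line)
        if m and m["status"].startswith("5"):
            yield m["req"], m["path"], int(m["status"])

# Sample copied from the output above; detail lines ("└─ Error: ...") are ignored.
sample = """\
  POST /checkout    201  120ms  req:4f9a3b2c
  POST /checkout    500   44ms  req:550e8400
     └─ Error: Stripe API timeout
""".splitlines()

failures = list(failed_request_ids(sample))
print(failures)
```

The ID it returns is what you pass to flux why in the next step.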

Step 2 — Get the root cause instantly:

$ flux why 550e8400

  ROOT CAUSE   Stripe API timeout after 10s
  LOCATION     payments/create.ts:42
  DATA CHANGES users.id=42  plan: free → null  (rolled back)
  SUGGESTION   → Add 5s timeout + idempotency key retry
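
The suggestion combines two things: a shorter deadline on the upstream call, and retries that reuse one idempotency key so a retry cannot charge twice. A minimal sketch of that pattern follows. `charge`, `UpstreamTimeout`, and `charge_with_retry` are hypothetical names used for illustration; they are not the Stripe SDK and not the code at payments/create.ts.

```python
import uuid

class UpstreamTimeout(Exception):
    """Raised by the (hypothetical) payment call when its deadline passes."""

def charge_with_retry(charge, amount_cents, timeout_s=5.0, attempts=3):
    # One key for all attempts: if an earlier attempt actually succeeded
    # upstream, the provider can deduplicate the retry instead of charging again.
    key = str(uuid.uuid4())
    last_error = None
    for _ in range(attempts):
        try:
            return charge(amount_cents, idempotency_key=key, timeout=timeout_s)
        except UpstreamTimeout as exc:
            last_error = exc  # remember the failure and try again
    raise last_error

# Fake payment call for demonstration: times out once, then succeeds.
calls = []
def flaky_charge(amount_cents, idempotency_key, timeout):
    calls.append(idempotency_key)
    if len(calls) < 2:
        raise UpstreamTimeout("timed out")
    return {"status": "succeeded", "amount": amount_cents}

result = charge_with_retry(flaky_charge, 1999)
print(result["status"], len(calls), calls[0] == calls[1])
```

Real code would likely add a backoff between attempts; it is left out here to keep the sketch short.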

Step 3 — If you need to see the full execution:

$ flux trace 550e8400

  gateway                   2ms
  └─ create_order           8ms
     ├─ db.insert(orders)   4ms
     ├─ stripe.charge     180ms  ← timeout here
     └─ send_slack          — skipped (upstream error)

Total time from alert to root cause: under 30 seconds.

Replay a failed request

Scenario: you fixed the Stripe timeout bug. You want to verify the incident is resolved before deploying.

Step 1 — Identify the time window of the incident (from flux tail or your alert).

Step 2 — Replay that window against your current code:

$ flux incident replay 14:00..14:05

  Replaying 23 requests from 14:00–14:05…

  Side-effects: hooks off · events off · cron off
  Database writes: on · mutation log: on

  ✔  req:4f9a3b2c  POST /create_user   200  81ms
  ✔  req:a3c91ef0  GET  /list_users    200  12ms
  ✗  req:550e8400  POST /signup        500  44ms
     └─ Still failing: Stripe timeout

  23 replayed · 22 passing · 1 still failing

What is safe during replay?

  SIDE-EFFECT                              DURING REPLAY
  ──────────────────────────────────────────────────────────────────────────────────
  Outbound HTTP (webhooks, Stripe, Slack)  Disabled (requests are stubbed)
  Email sending                            Disabled
  Cron / scheduled jobs                    Disabled
  Database writes                          Enabled (runs against current schema)
  Mutation log                             Enabled (can be compared to original)
  Async job enqueueing                     Disabled (jobs are recorded but not executed)

Replay always runs against the code currently deployed in the environment you target. Because database writes stay enabled, replay changes production data only if you explicitly target the production environment.
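
One way to picture these rules: external effects are recorded instead of performed, while database writes go through. The toy model below is only a mental model under that assumption. `ReplayContext` and `create_order` are hypothetical and do not describe how Flux is implemented.

```python
class ReplayContext:
    """Toy model: external effects are recorded, database writes run."""

    def __init__(self, db):
        self.db = db          # writes go through to the target environment
        self.recorded = []    # every suppressed side-effect, in order

    def http(self, method, url):
        self.recorded.append(("http", method, url))
        return {"stubbed": True}  # caller receives a canned response

    def enqueue(self, job_name):
        self.recorded.append(("job", job_name))  # recorded, never executed

    def write(self, table, row):
        self.db.setdefault(table, []).append(row)

def create_order(ctx):
    # Hypothetical handler touching all three kinds of effect.
    ctx.write("orders", {"id": 1})
    ctx.http("POST", "https://payments.invalid/charge")
    ctx.enqueue("send_receipt")

db = {}
ctx = ReplayContext(db)
create_order(ctx)
print(db)
print(ctx.recorded)
```

The row lands in `db`, which is why the environment you target matters. The HTTP call and the job only appear in `recorded`.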

Inspect database mutations

Scenario: a database row is in a wrong state and you don't know what changed it.

Get full mutation history for a row:

$ flux state history users --id 42

  users id=42  (4 mutations)

  2026-03-10 12:00  INSERT  email=a@b.com, plan=free
  2026-03-10 14:21  UPDATE  name: null → Alice Smith    req:a3c91ef0
  2026-03-10 14:22  UPDATE  plan: free → pro            req:4f9a3b2c
  2026-03-10 14:22  UPDATE  plan: pro → null (rollback) req:550e8400

Find which request owns each column's current value:

$ flux state blame users --id 42

  email   a@b.com   req:4f9a3b2c  12:00:00
  name    Alice     req:a3c91ef0  14:21:59
  plan    free      req:550e8400  14:22:01  ✗ rolled back

Each req: ID can be passed to flux why or flux trace to understand exactly what caused that mutation.
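
Conceptually, blame is a fold over the mutation history: for each column, the last request that wrote it owns the current value. The sketch below shows that idea on hand-written data. The record shape and the request IDs `req-a`, `req-b`, `req-c` are assumptions for illustration, not Flux's storage format.

```python
def blame(mutations):
    """Return {column: (value, request_id)} from an oldest-first mutation list."""
    owners = {}
    for m in mutations:
        for column, value in m["changes"].items():
            owners[column] = (value, m["req"])  # later writes replace earlier ones
    return owners

# Hypothetical history for one row, oldest first.
history = [
    {"req": "req-a", "changes": {"email": "a@b.com", "plan": "free"}},
    {"req": "req-b", "changes": {"name": "Alice Smith"}},
    {"req": "req-c", "changes": {"plan": "pro"}},
]

owners = blame(history)
print(owners)
```

Here `plan` is owned by `req-c` even though `req-a` set it first, which is the question blame answers.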

Compare two executions

Scenario: the same endpoint started behaving differently after a deploy. You want to see exactly what changed.

Diff two request traces span-by-span:

$ flux trace diff 4f9a3b2c 550e8400

  SPAN              BEFORE       AFTER
  ─────────────────────────────────────────────────────
  gateway           2ms          2ms          — same
  create_order      81ms         44ms         — faster
  ├─ db.insert      4ms          4ms          — same
  ├─ stripe.charge  68ms         → timeout    ✗ changed
  └─ send_slack     7ms          — skipped    ✗ missing

You can diff any two requests; they don't need to hit the same endpoint. This is useful for A/B comparisons or for checking how behaviour changed across a deploy.
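
A span-by-span diff boils down to matching spans by name and classifying each pair. The sketch below uses durations from the table above and models a skipped span as `None`. `diff_spans` and its `tolerance_ms` parameter are hypothetical; Flux's own comparison rules may differ.

```python
def diff_spans(before, after, tolerance_ms=1):
    """Compare two {span_name: duration_ms or None} maps, keeping span order."""
    names = list(before) + [n for n in after if n not in before]
    rows = []
    for name in names:
        b, a = before.get(name), after.get(name)
        if b is None or a is None:
            # A span that ran in only one of the two traces is "missing".
            verdict = "same" if b is a else "missing"
        elif abs(a - b) <= tolerance_ms:
            verdict = "same"
        else:
            verdict = "faster" if a < b else "slower"
        rows.append((name, b, a, verdict))
    return rows

before = {"gateway": 2, "create_order": 81, "db.insert": 4, "send_slack": 7}
after = {"gateway": 2, "create_order": 44, "db.insert": 4, "send_slack": None}

rows = diff_spans(before, after)
for name, b, a, verdict in rows:
    print(f"{name:14} {b!s:>5} {a!s:>5}  {verdict}")
```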

Find the commit that broke a request

Scenario: a request that was passing last week is now failing. You have 42 commits to search through.

Step 1 — Find the failing request ID with flux tail or from your logs.

Step 2 — Run bisect:

$ flux bug bisect --request 550e8400

  Bisecting 42 commits (2026-03-01..2026-03-10)…

  Testing abc123…  ✔ passes
  Testing fde789…  ✔ passes
  Testing def456…  ✗ fails

  FIRST BAD COMMIT
  def456  "feat: add retry logic to stripe.charge"
  2026-03-08 by alice@example.com

  → Compare before/after:
     flux trace diff abc123:550e8400 def456:550e8400

How bisect works:

  1. Checks out the next candidate commit (a binary search over the range, so only a few commits are tested)
  2. Re-runs the request input against the code at that commit
  3. Compares the response + span signature to determine pass/fail
  4. Continues until it isolates the first failing commit

Bisect requires that your project is a git repository and that flux auth is configured for CI use.
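
The search itself is an ordinary binary search over the commit range. The sketch below shows that search on its own, with a stand-in `passes` function where Flux would re-run the request. It assumes the oldest commit passes, the newest fails, and every commit after the first bad one also fails. The commit IDs `c0ffee`, `9a8b7c`, and `head00` are made up to pad the list.

```python
def first_bad_commit(commits, passes):
    """Binary-search an oldest-first commit list for the first failing commit."""
    good, bad = 0, len(commits) - 1   # commits[good] passes, commits[bad] fails
    while bad - good > 1:
        mid = (good + bad) // 2
        if passes(commits[mid]):
            good = mid                # breakage is later than mid
        else:
            bad = mid                 # breakage is at mid or earlier
    return commits[bad]

commits = ["abc123", "fde789", "c0ffee", "def456", "9a8b7c", "head00"]
broken_from = commits.index("def456")

tested = []
def passes(commit):
    # Stand-in for "replay the request at this commit and compare the result".
    tested.append(commit)
    return commits.index(commit) < broken_from

culprit = first_bad_commit(commits, passes)
print(culprit, tested)
```

Each test halves the remaining range, so the number of commits tested grows with the logarithm of the range size.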

