Replay a Production Incident
Verify your fix against the exact requests that caused the incident — before you deploy.
Scenario: you've fixed the Stripe timeout bug. You want to confirm the incident is resolved without deploying to production blind.
Step 1 — Identify the incident time window
From flux tail, your alert, or your monitoring tool, find the start and end time of the incident:
$ flux tail --filter status=500 --since 2h
POST /signup 500 44ms req:550e8400 14:22:01
POST /signup 500 51ms req:7a8b9c0d 14:22:44
POST /signup 500 38ms req:1b2c3d4e 14:23:10
# Incident window: 14:00 → 14:30
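If you would rather derive the window from the failing lines than read it off by eye, the following is a minimal Python sketch. It assumes the line shape shown in the sample output above (method, path, status, duration, `req:` id, `HH:MM:SS`), which is not a documented format. The function name `incident_window` and the 30-minute rounding are illustrative choices, not part of flux.

```python
import re

# Line shape assumed from the sample output above, e.g.
# "POST /signup 500 44ms req:550e8400 14:22:01". Not a documented format.
LINE_RE = re.compile(
    r"^(?P<method>[A-Z]+)\s+(?P<path>\S+)\s+(?P<status>\d{3})\s+"
    r"(?P<ms>\d+)ms\s+req:(?P<req>\w+)\s+(?P<time>\d{2}:\d{2}:\d{2})$"
)


def _to_seconds(hhmmss):
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def _to_hhmm(total_seconds):
    hours, remainder = divmod(total_seconds, 3600)
    return f"{hours:02d}:{remainder // 60:02d}"


def incident_window(lines, granularity_minutes=30):
    """Return (start, end) as 'HH:MM' strings covering every 5xx line.

    Hypothetical helper: windows that cross midnight are not handled.
    """
    times = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("status").startswith("5"):
            times.append(_to_seconds(match.group("time")))
    if not times:
        raise ValueError("no failing requests found")

    step = granularity_minutes * 60
    start = (min(times) // step) * step   # round earliest failure down
    end = -(-max(times) // step) * step   # round latest failure up
    if end == start:                      # failure exactly on a boundary
        end += step
    return _to_hhmm(start), _to_hhmm(end)


sample = [
    "POST /signup 500 44ms req:550e8400 14:22:01",
    "POST /signup 500 51ms req:7a8b9c0d 14:22:44",
    "POST /signup 500 38ms req:1b2c3d4e 14:23:10",
]
start, end = incident_window(sample)
print(f"{start}..{end}")
```

Joining the two values with `..` gives the same `START..END` form that `flux incident replay` takes in the next step. Rounding outward keeps requests just before the first visible failure inside the window.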
Step 2 — Replay that window against your current code
$ flux incident replay 14:00..14:30
Replaying 47 requests from 14:00–14:30…
Side-effects: hooks off · events off · cron off
Database writes: on · mutation log: on
✔ req:4f9a3b2c POST /create_user 200 81ms
✔ req:a3c91ef0 GET /list_users 200 12ms
✔ req:550e8400 POST /signup 200 88ms ← was 500
✔ req:7a8b9c0d POST /signup 200 91ms ← was 500
47 replayed · 47 passing · 0 still failing ✔ incident resolved
All 47 requests pass, including the ones that previously returned 500, so the fix is safe to deploy.
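If you want a script or CI job to make the same call, one option is to parse the summary line of the replay output. The sketch below assumes the summary looks exactly like the sample above (`N replayed · N passing · N still failing`). That wording is taken from the example output and may not be a stable format, and `replay_resolved` is a hypothetical helper name.

```python
import re

# Summary shape assumed from the sample output above; not a documented format.
SUMMARY_RE = re.compile(
    r"(?P<replayed>\d+) replayed\s*·\s*(?P<passing>\d+) passing\s*·\s*"
    r"(?P<failing>\d+) still failing"
)


def replay_resolved(output):
    """Return True only if every replayed request passed."""
    match = SUMMARY_RE.search(output)
    if match is None:
        raise ValueError("no replay summary line found")
    replayed = int(match["replayed"])
    passing = int(match["passing"])
    failing = int(match["failing"])
    # An empty replay proves nothing, so treat zero requests as unresolved.
    return replayed > 0 and failing == 0 and passing == replayed


summary = "47 replayed · 47 passing · 0 still failing ✔ incident resolved"
print(replay_resolved(summary))
```

Treating a zero-request replay as unresolved is a deliberate choice here: a wrong time window would otherwise look like a passing result.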
What is safe during replay
| Side-effect type | During replay | Notes |
|---|---|---|
| Outbound HTTP (webhooks, Stripe, Slack) | Stubbed | Returns the recorded response from the original trace |
| Email / SMS sending | Disabled | Send calls return { status: "stubbed" } |
| Cron / scheduled jobs | Disabled | No jobs are dispatched during replay |
| Async job enqueue | Recorded, not dispatched | Jobs appear in the mutation log but don't run |
| Database reads | Live | Reads your current database state |
| Database writes | Enabled | Mutations are recorded in the log for comparison |
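Important: database writes are enabled during replay, so every replayed write produces a real mutation in the target database. Replaying against a development database is recommended, especially if your production data or schema is sensitive. Replaying against production is low-risk only when the window is read-heavy; any write requests it contains will be applied for real.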
Replay a single request
If you only want to replay one specific request:
$ flux incident replay --request 550e8400
Replaying req:550e8400…
✔ POST /signup 200 88ms ← was 500
How determinism works
Replay uses the request_input from the original execution record (HTTP body, headers, auth context) as the input to the re-execution. Outbound API calls return the responses recorded in the original trace, so your function sees the same external data at every step, even if the external service would return different data today. Database reads are the exception: as the table above notes, they run live against your current database state.
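To make the idea concrete, here is a small Python sketch of record-and-replay stubbing. It is an illustration of the concept only, not flux's implementation: the record layout (`request_input`, `outbound`), the `RecordedHttp` class, the `signup` handler, and the example URL are all hypothetical.

```python
from collections import deque


class RecordedHttp:
    """Stub client that returns recorded responses instead of calling out."""

    def __init__(self, recorded_calls):
        # Copy into a deque so each replay consumes its own queue.
        self._calls = deque(recorded_calls)

    def request(self, method, url):
        if not self._calls:
            raise RuntimeError(f"unrecorded outbound call: {method} {url}")
        recorded = self._calls.popleft()
        if (recorded["method"], recorded["url"]) != (method, url):
            raise RuntimeError(f"call order diverged at {method} {url}")
        return recorded["response"]


def signup(request_input, http):
    """Hypothetical handler: makes one outbound call, then responds."""
    charge = http.request("POST", "https://api.example.com/charge")
    status = 200 if charge["status"] == "succeeded" else 500
    return {"status": status, "email": request_input["body"]["email"]}


def replay(record, handler):
    """Re-run a handler with the recorded input and recorded responses."""
    return handler(record["request_input"], RecordedHttp(record["outbound"]))


# Hypothetical execution record; the real record layout may differ.
record = {
    "request_input": {
        "body": {"email": "a@example.com"},
        "headers": {},
        "auth": None,
    },
    "outbound": [
        {
            "method": "POST",
            "url": "https://api.example.com/charge",
            "response": {"status": "succeeded"},
        }
    ],
}

print(replay(record, signup))
```

Because the handler receives the recorded input and recorded responses, running `replay` twice yields the same result. The stub raises if the new code makes a call the original trace never recorded, which is one way such a divergence could be surfaced.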
See the production guide for a detailed breakdown of the determinism guarantees.