Why Flux Exists

The problem with production debugging today — and the idea that fixes it.

The problem

A request fails in production. You get an alert. Now what?

With a traditional backend you open five tools in five tabs:

Tool	What you find	What's missing
Log aggregator	A stack trace with a timestamp	No request body, no DB state
Metrics dashboard	A spike in p99 latency	No idea which request caused it
Trace UI	Spans — if you remembered to instrument	Requires SDK setup, no DB mutations
DB console	Current row state	No history of what changed it
Queue dashboard	A failed job	No link back to the HTTP request that enqueued it

Each tool holds a fragment of evidence, and no common identifier ties the fragments together. You spend more time correlating evidence than understanding the bug.

This is the normal production debugging experience in 2026. It is slow, frustrating, and completely avoidable.

The insight

Every production bug is caused by a request that did something unexpected.

If you had a complete record of exactly what that request did — every span, every database write, every input and output — you could root-cause any bug in seconds instead of hours.

That record already exists in your system. It is scattered. Flux assembles it.

The solution: execution history

Flux runs your backend inside a runtime that records every request automatically — no instrumentation, no SDK, no configuration. The result is an execution record: a single object that contains everything that request did.

Request: POST /signup  req:550e8400

  INPUT       { email: "a@b.com" }
  SPANS       gateway → create_user → stripe.charge → db.insert
  MUTATIONS   users.plan: free → null  (rolled back)
  ERROR       Stripe API timeout at payments/create.ts:42
  RESPONSE    500  44ms
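
To make "a single object" concrete, the record above can be pictured as plain data. This is only a sketch: the interface and every field name (`requestId`, `spans`, `mutations`, `durationMs`, and so on) are assumptions chosen to mirror the example, not Flux's documented schema.

```typescript
// Hypothetical shape of an execution record. Field names are illustrative
// and are not taken from Flux's actual schema.
interface Mutation {
  table: string;
  column: string;
  before: unknown;
  after: unknown;
  rolledBack: boolean;
}

interface ExecutionRecord {
  requestId: string;
  method: string;
  path: string;
  input: unknown;
  spans: string[]; // span names in execution order
  mutations: Mutation[];
  error?: { message: string; location: string };
  response: { status: number; durationMs: number };
}

// The POST /signup example, expressed as one object.
const record: ExecutionRecord = {
  requestId: "550e8400",
  method: "POST",
  path: "/signup",
  input: { email: "a@b.com" },
  spans: ["gateway", "create_user", "stripe.charge", "db.insert"],
  mutations: [
    { table: "users", column: "plan", before: "free", after: null, rolledBack: true },
  ],
  error: { message: "Stripe API timeout", location: "payments/create.ts:42" },
  response: { status: 500, durationMs: 44 },
};

// Build a one-line summary from the record alone, with no other tool involved.
function summarize(r: ExecutionRecord): string {
  const outcome = r.error ? `failed at ${r.error.location}` : "ok";
  return `${r.method} ${r.path} req:${r.requestId} ${r.response.status} ${outcome}`;
}

console.log(summarize(record));
```

Because input, spans, mutations, and error sit under one request ID, a question like "what did this request write before it failed?" becomes a field lookup rather than a cross-tool search.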

With that record, debugging collapses into one command:

$ flux why 550e8400

  ROOT CAUSE   Stripe API timeout after 10s
  LOCATION     payments/create.ts:42
  DATA CHANGES users.id=42  plan: free → null  (rolled back)
  SUGGESTION   → Add 5s timeout + idempotency key retry

And if you want to verify your fix before deploying, you can replay the exact requests that caused the incident against your current code:

$ flux incident replay 14:00..14:05

  23 replayed · 22 passing · 1 still failing  ← fix not complete yet
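
The summary line above is a tally over recorded requests. A minimal sketch of that tally follows, assuming each recorded request carries its original input and that "passing" means the current handler returns a status below 500 without throwing. `RecordedRequest`, `Handler`, and `replay` are hypothetical names for illustration, not Flux internals.

```typescript
// Hypothetical types: a request captured during the incident, and the
// current version of the handler it should be replayed against.
interface RecordedRequest {
  id: string;
  input: { email: string };
}
type Handler = (input: { email: string }) => { status: number };

interface ReplayResult {
  replayed: number;
  passing: number;
  failing: string[]; // IDs of requests that still fail
}

// Re-run every recorded input and count the outcomes.
// Assumption: a thrown error or a 5xx status counts as "still failing".
function replay(requests: RecordedRequest[], handler: Handler): ReplayResult {
  let passing = 0;
  const failing: string[] = [];
  for (const req of requests) {
    let ok = false;
    try {
      ok = handler(req.input).status < 500;
    } catch {
      ok = false;
    }
    if (ok) {
      passing++;
    } else {
      failing.push(req.id);
    }
  }
  return { replayed: requests.length, passing, failing };
}

const recorded: RecordedRequest[] = [
  { id: "req-1", input: { email: "a@b.com" } },
  { id: "req-2", input: { email: "" } },
  { id: "req-3", input: { email: "c@d.com" } },
];

// Stand-in for the current code: it still returns 500 for an empty email.
const currentHandler: Handler = (input) => ({ status: input.email ? 200 : 500 });

const result = replay(recorded, currentHandler);
console.log(
  `${result.replayed} replayed · ${result.passing} passing · ${result.failing.length} still failing`,
);
```

Returning the failing IDs, not just a count, is what lets each remaining failure be inspected individually.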

The analogy

Git records the history of your code. Every change is stored, attributable, diffable, and revertable.

Flux records the history of your code executing. Every request is stored, attributable, diffable, and replayable.

Git       → what your code looked like
Flux      → what your code did

These are complementary. Git tells you what changed. Flux tells you what happened when it ran.

What this enables

Capability	Command	What was previously required
Root-cause any failure	`flux why <id>`	Grep logs, correlate trace IDs manually
See every DB change a request made	`flux state history`	DB console + guesswork
Replay an incident safely	`flux incident replay`	Not possible: side effects make it unsafe
Find the breaking commit	`flux bug bisect`	Manual git bisect + redeploy loop
Compare two executions	`flux trace diff`	Not possible across tools
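
The bisect row in the table amounts to a binary search over commits, re-running a recorded request at each step. How `flux bug bisect` is implemented is not described here. The sketch below shows only the general technique, assuming commits are ordered oldest to newest, the first is known good, and the last is known bad. The `isBad` callback stands in for "replay the request against this commit and check whether it fails".

```typescript
// Return the first bad commit.
// Preconditions (assumed, not checked beyond length): commits are ordered
// oldest to newest, commits[0] is good, and the last commit is bad.
function bisect(commits: string[], isBad: (sha: string) => boolean): string {
  if (commits.length < 2) {
    throw new Error("need at least one good and one bad commit");
  }
  let good = 0; // index of a commit known to be good
  let bad = commits.length - 1; // index of a commit known to be bad
  while (bad - good > 1) {
    const mid = Math.floor((good + bad) / 2);
    if (isBad(commits[mid])) {
      bad = mid;
    } else {
      good = mid;
    }
  }
  return commits[bad];
}

// Illustrative data: the recorded request starts failing at commit "d4".
const history = ["a1", "b2", "c3", "d4", "e5"];
const failsAt = new Set(["d4", "e5"]);

let replays = 0;
const firstBad = bisect(history, (sha) => {
  replays++; // each check stands in for one replay of the recorded request
  return failsAt.has(sha);
});

console.log(firstBad, replays);
```

The number of replays grows with the logarithm of the commit count, which is why automating the check per commit matters more than the search itself.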

Open source and self-hosted

Flux is open source (MIT). Its components (gateway, runtime engine, data engine, and queue) ship as a single binary. Run it on your own infrastructure, with one Postgres database as the only dependency.

$ flux init my-app && cd my-app
$ flux dev   # starts Flux + embedded Postgres on localhost:4000

No cloud account. No API keys. No telemetry. Your data stays on your server.

Why a runtime, not an SDK?

The first question developers ask: "Can't I just add a library to my Express app?"

No. Here's why:

Capability	Bolt-on SDK	Flux runtime
Trace spans	✔ (manual instrumentation)	✔ (automatic)
Record DB mutations in the same transaction	✗ Sits outside the database driver	✔ Data engine wraps the query
Replay production traffic with side effects disabled	✗ Can't intercept outbound calls	✔ Runtime controls all I/O
Bisect across git history	✗ Can't redeploy and re-execute per commit	✔ Runtime manages deploys
Link every span to exact code version	✗ No deploy awareness	✔ Knows the code SHA for every execution

An SDK can tell you that something failed. A runtime can tell you exactly what happened and let you replay it.
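
The "Runtime controls all I/O" row is the key to safe replay. One way to picture it: application code reaches the outside world only through a function the runtime hands it, so the runtime can swap a recording implementation for a replaying one. The sketch below is an illustration of that idea under stated assumptions. `Outbound`, `recordingIo`, and `replayingIo` are hypothetical names, not Flux's API.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not Flux's API.
type Outbound = (target: string, payload: unknown) => unknown;

interface IoLogEntry {
  target: string;
  payload: unknown;
  result: unknown;
}

// Live mode: perform the real call and append the outcome to the log.
function recordingIo(real: Outbound, log: IoLogEntry[]): Outbound {
  return (target, payload) => {
    const result = real(target, payload);
    log.push({ target, payload, result });
    return result;
  };
}

// Replay mode: make no real call and serve results from the log in order.
function replayingIo(log: IoLogEntry[]): Outbound {
  let cursor = 0;
  return (target) => {
    const entry = log[cursor++];
    if (!entry || entry.target !== target) {
      throw new Error(`replay diverged at call ${cursor}: ${target}`);
    }
    return entry.result;
  };
}

// Application code sees only the injected `io`, so it behaves the same
// in both modes.
function chargeOnSignup(io: Outbound, email: string): string {
  const charge = io("stripe.charge", { email }) as { id: string };
  return `charged ${charge.id}`;
}

let realCalls = 0;
const ioLog: IoLogEntry[] = [];
const liveIo = recordingIo((_target, _payload) => {
  realCalls++; // stands in for a real network request
  return { id: "ch_1" };
}, ioLog);

const liveResult = chargeOnSignup(liveIo, "a@b.com");
const replayResult = chargeOnSignup(replayingIo(ioLog), "a@b.com");

// The replay reproduces the live result without a second real call.
console.log(liveResult === replayResult, realCalls);
```

A library added to an existing app can wrap some calls this way, but it cannot guarantee that every outbound call goes through the wrapper. That guarantee is what the table attributes to the runtime.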

The tradeoff is explicit: your code runs inside Flux. But migration is wrapping, not rewriting: `req.body` becomes `input`, `res.json()` becomes `return`, and `db.query()` becomes `ctx.db.query()`. Start with one endpoint. Run Flux alongside your existing stack. Migrate at your own pace.
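
As a rough illustration of that mapping, here is an Express-style handler next to a wrapped version. The `(input, ctx)` signature, the `Ctx` and `SignupInput` types, and the SQL are assumptions made for this sketch. Only the three substitutions themselves come from the text above, so check the actual handler shape before relying on it.

```typescript
// Before (Express-style, shown as a comment for comparison):
//
//   app.post("/signup", async (req, res) => {
//     const rows = await db.query(
//       "INSERT INTO users (email) VALUES ($1) RETURNING id",
//       [req.body.email],
//     );
//     res.json({ id: rows[0].id });
//   });

// After: the same logic with the three substitutions applied.
// `Ctx`, `SignupInput`, and the (input, ctx) signature are hypothetical.
interface Ctx {
  db: {
    query: (sql: string, params: unknown[]) => Promise<Array<{ id: number }>>;
  };
}

interface SignupInput {
  email: string;
}

async function signupHandler(input: SignupInput, ctx: Ctx): Promise<{ id: number }> {
  const rows = await ctx.db.query(
    "INSERT INTO users (email) VALUES ($1) RETURNING id",
    [input.email], // req.body      -> input
  ); //                db.query()   -> ctx.db.query()
  return { id: rows[0].id }; // res.json() -> return
}

// Exercising the handler with a stub context; no database is required.
const seenParams: unknown[][] = [];
const stubCtx: Ctx = {
  db: {
    query: async (_sql, params) => {
      seenParams.push(params);
      return [{ id: 42 }];
    },
  },
};

signupHandler({ email: "a@b.com" }, stubCtx).then((out) => console.log(out));
```

The body of the handler is unchanged apart from where its input, database handle, and result come from, which is the sense in which migration is wrapping.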

What Flux is not

Not a log aggregator. Logs are unstructured. Execution records are structured around request IDs and span trees.
Not an APM / metrics platform. APMs tell you that something is wrong. Flux tells you exactly what happened and lets you replay it.
Not a distributed tracing system. Tracing requires SDK instrumentation. Flux records automatically at the runtime level.
Not a serverless platform. Flux runs your functions, but it's not a cloud compute layer. It's a self-hosted runtime where the primary value is execution history and debugging — not autoscaling to zero.
