Why Flux Exists

The problem with production debugging today — and the idea that fixes it.

The problem

A request fails in production. You get an alert. Now what?

With a traditional backend you open five tools in five tabs:

Tool              | What you find                           | What's missing
------------------|-----------------------------------------|---------------------------------------------------
Log aggregator    | A stack trace with a timestamp          | No request body, no DB state
Metrics dashboard | A spike in p99 latency                  | No idea which request caused it
Trace UI          | Spans, if you remembered to instrument  | Requires SDK setup, no DB mutations
DB console        | Current row state                       | No history of what changed it
Queue dashboard   | A failed job                            | No link back to the HTTP request that enqueued it

Each tool holds a fragment of the evidence, and no common identifier ties the fragments together. You spend more time correlating evidence than understanding the bug.

This is the normal production debugging experience in 2026. It is slow, frustrating, and completely avoidable.

The insight

Every production bug is caused by a request that did something unexpected.

If you had a complete record of exactly what that request did — every span, every database write, every input and output — you could root-cause any bug in seconds instead of hours.

That record already exists in your system. It is scattered. Flux assembles it.

The solution: execution history

Flux runs your backend inside a runtime that records every request automatically — no instrumentation, no SDK, no configuration. The result is an execution record: a single object that contains everything that request did.

Request: POST /signup  req:550e8400

  INPUT       { email: "a@b.com" }
  SPANS       gateway → create_user → stripe.charge → db.insert
  MUTATIONS   users.plan: free → null  (rolled back)
  ERROR       Stripe API timeout at payments/create.ts:42
  RESPONSE    500  44ms
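
To make "a single object" concrete, the record above could be modeled roughly as follows. This is a sketch only: the type and field names (`ExecutionRecord`, `Mutation`, `rolledBack`, `summarize`, and so on) are assumptions made for illustration, not Flux's actual schema.

```typescript
// Hypothetical shape of an execution record. Field names are illustrative,
// not taken from Flux itself.
interface Mutation {
  table: string;
  column: string;
  before: unknown;
  after: unknown;
  rolledBack: boolean;
}

interface ExecutionRecord {
  requestId: string;
  method: string;
  path: string;
  input: unknown;
  spans: string[]; // span names in execution order
  mutations: Mutation[];
  error?: { message: string; location: string };
  response: { status: number; durationMs: number };
}

// The POST /signup example, expressed in that shape.
const record: ExecutionRecord = {
  requestId: "550e8400",
  method: "POST",
  path: "/signup",
  input: { email: "a@b.com" },
  spans: ["gateway", "create_user", "stripe.charge", "db.insert"],
  mutations: [
    { table: "users", column: "plan", before: "free", after: null, rolledBack: true },
  ],
  error: { message: "Stripe API timeout", location: "payments/create.ts:42" },
  response: { status: 500, durationMs: 44 },
};

// One object answers questions that would otherwise span several tools.
function summarize(r: ExecutionRecord): string {
  const outcome = r.error ? `failed at ${r.error.location}` : "succeeded";
  return `${r.method} ${r.path} ${outcome} (${r.response.status}), ${r.mutations.length} mutation(s)`;
}

console.log(summarize(record));
```

Because input, spans, mutations, and error all hang off one request ID, no cross-tool correlation step is needed to relate them.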

With that record, debugging collapses into one command:

$ flux why 550e8400

  ROOT CAUSE   Stripe API timeout after 10s
  LOCATION     payments/create.ts:42
  DATA CHANGES users.id=42  plan: free → null  (rolled back)
  SUGGESTION   → Add 5s timeout + idempotency key retry

And if you want to verify your fix before deploying, you can replay the exact requests that caused the incident against your current code:

$ flux incident replay 14:00..14:05

  23 replayed · 22 passing · 1 still failing  ← fix not complete yet

The analogy

Git records the history of your code. Every change is stored, attributable, diffable, and revertable.

Flux records the history of your code executing. Every request is stored, attributable, diffable, and replayable.

Git   → what your code looked like
Flux  → what your code did

These are complementary. Git tells you what changed. Flux tells you what happened when it ran.

What this enables

Capability                         | Command              | What was previously required
-----------------------------------|----------------------|---------------------------------------------
Root-cause any failure             | flux why <id>        | Grep logs, correlate trace IDs manually
See every DB change a request made | flux state history   | DB console + guesswork
Replay an incident safely          | flux incident replay | Not possible: side effects make it unsafe
Find the breaking commit           | flux bug bisect      | Manual git bisect + redeploy loop
Compare two executions             | flux trace diff      | Not possible across tools
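
As a rough illustration of the idea behind bisecting with recorded requests, the sketch below runs a binary search over an ordered commit list, using "replay the recorded request at this commit" as the pass/fail test. It is not how `flux bug bisect` is implemented: `firstFailingCommit`, `ReplayFn`, and the commit names are hypothetical, and the search assumes the oldest commit passes, the newest fails, and the failure persists once introduced (the same assumptions git bisect makes).

```typescript
// Hypothetical sketch: find the first commit at which a recorded request fails.
// `passes` stands in for "deploy this commit and replay the recorded request".
type ReplayFn = (commit: string) => boolean; // true = replayed request passes

function firstFailingCommit(commits: string[], passes: ReplayFn): string {
  if (commits.length < 2) {
    throw new Error("need at least one known-good and one known-bad commit");
  }
  let lo = 0; // index of a commit known to pass
  let hi = commits.length - 1; // index of a commit known to fail
  while (hi - lo > 1) {
    const mid = Math.floor((lo + hi) / 2);
    if (passes(commits[mid])) {
      lo = mid; // breakage was introduced after mid
    } else {
      hi = mid; // breakage was introduced at or before mid
    }
  }
  return commits[hi];
}

// Stand-in data: commits ordered oldest to newest, broken from "d4" onward.
const commits = ["a1", "b2", "c3", "d4", "e5", "f6"];
const brokenFrom = commits.indexOf("d4");
const replay: ReplayFn = (c) => commits.indexOf(c) < brokenFrom;

console.log(firstFailingCommit(commits, replay)); // prints "d4"
```

With a recorded request as the test case, each step needs only a replay rather than a manual reproduction, which is what removes the redeploy loop from the search.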

Open source and self-hosted

Flux is open source (MIT). The runtime — gateway, runtime engine, data engine, queue — ships as a single binary. Run it on your own infrastructure with one Postgres database as the only dependency.

$ flux init my-app && cd my-app
$ flux dev   # starts Flux + embedded Postgres on localhost:4000

No cloud account. No API keys. No telemetry. Your data stays on your server.

Why a runtime, not an SDK?

The first question developers ask: "Can't I just add a library to my Express app?"

No. Here's why:

Capability                                           | Bolt-on SDK                                  | Flux runtime
-----------------------------------------------------|----------------------------------------------|------------------------------------------
Trace spans                                          | ✔ (manual instrumentation)                   | ✔ (automatic)
Record DB mutations in the same transaction          | ✗ Sits outside the database driver           | ✔ Data engine wraps the query
Replay production traffic with side effects disabled | ✗ Can't intercept outbound calls             | ✔ Runtime controls all I/O
Bisect across git history                            | ✗ Can't re-deploy and re-execute per commit  | ✔ Runtime manages deploys
Link every span to exact code version                | ✗ No deploy awareness                        | ✔ Knows the code SHA for every execution

An SDK can tell you that something failed. A runtime can tell you exactly what happened and let you replay it.
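
The "runtime controls all I/O" row is the one that makes safe replay possible, and a small sketch can show why. It assumes handlers reach external services only through an injected interface. The names `Outbound`, `RecordedCall`, `replayOutbound`, and `signup` are hypothetical and are not part of Flux's API.

```typescript
// Hypothetical sketch: handlers never call the network directly. They go
// through an interface that the runtime supplies.
interface Outbound {
  call(target: string, payload: unknown): Promise<unknown>;
}

// An outbound call captured during the original execution.
interface RecordedCall {
  target: string;
  response: unknown;
}

// Replay mode: serve recorded responses in order and never touch the real
// service, so replaying a request cannot charge a card or send an email.
function replayOutbound(recorded: RecordedCall[]): Outbound {
  let next = 0;
  return {
    async call(target) {
      const rec = recorded[next++];
      if (!rec || rec.target !== target) {
        throw new Error(`replay diverged at call ${next}: got ${target}`);
      }
      return rec.response;
    },
  };
}

// A handler written against the interface runs unchanged in live or replay mode.
async function signup(input: { email: string }, io: Outbound): Promise<string> {
  const charge = (await io.call("stripe.charge", { email: input.email })) as { ok: boolean };
  return charge.ok ? "created" : "payment_failed";
}

const io = replayOutbound([{ target: "stripe.charge", response: { ok: false } }]);
signup({ email: "a@b.com" }, io).then((result) => console.log(result)); // prints "payment_failed"
```

A library added to an existing app sees only the calls that were routed through it. A runtime that owns the boundary can swap the implementation for every call.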

The tradeoff is explicit: your code runs inside Flux. But migration is wrapping, not rewriting — req.body becomes input, res.json() becomes return, db.query() becomes ctx.db.query(). Start with one endpoint. Run Flux alongside your existing stack. Migrate at your own pace.
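
As a sketch of what that wrapping might look like for one endpoint: the `Ctx` and `Db` types, the `createUser` name, and the assumption that `query` resolves to an array of rows are all illustrative. Flux's real handler signature may differ.

```typescript
// Hypothetical types. Flux's real handler signature and ctx shape may differ.
interface Db {
  query(sql: string, params: unknown[]): Promise<Array<Record<string, unknown>>>;
}
interface Ctx {
  db: Db;
}

// Before (Express style), kept as a comment for comparison:
//   app.post("/users", async (req, res) => {
//     const rows = await db.query(
//       "INSERT INTO users (email) VALUES ($1) RETURNING id", [req.body.email]);
//     res.json({ id: rows[0].id });
//   });

// After: input replaces req.body, return replaces res.json(),
// and ctx.db.query() replaces db.query().
async function createUser(input: { email: string }, ctx: Ctx): Promise<{ id: unknown }> {
  const rows = await ctx.db.query(
    "INSERT INTO users (email) VALUES ($1) RETURNING id",
    [input.email],
  );
  return { id: rows[0].id };
}

// In-memory stand-in for ctx.db so the sketch runs without Postgres.
const fakeCtx: Ctx = {
  db: {
    async query(_sql, params) {
      return [{ id: 1, email: params[0] }];
    },
  },
};

createUser({ email: "a@b.com" }, fakeCtx).then((out) => console.log(out));
```

The body of the handler is unchanged. Only the way it receives input, returns output, and reaches the database moves behind the runtime.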

What Flux is not

  • Not a log aggregator. Logs are unstructured. Execution records are structured around request IDs and span trees.
  • Not an APM / metrics platform. APMs tell you something is wrong. Flux tells you exactly what happened and lets you replay it.
  • Not a distributed tracing system. Tracing requires SDK instrumentation. Flux records automatically at the runtime level.
  • Not a serverless platform. Flux runs your functions, but it's not a cloud compute layer. It's a self-hosted runtime where the primary value is execution history and debugging — not autoscaling to zero.
