Production Guide
Everything you need to know before running Flux in production.
Performance overhead
Span recording adds <1ms to typical requests (async, off the critical path). Mutation logging adds ~0.2ms per write, batched into the same transaction. No SDK, no instrumentation — overhead is built into the infrastructure layer.
Recording every execution adds overhead at three points:
| What is recorded | Overhead | How it is minimised |
|---|---|---|
| Span recording (gateway, function, tool calls) | ~0.1ms per span, async | Written after the response is flushed — never on the critical path |
| Mutation logging (DB writes) | ~0.2ms per mutation | Batched in the same transaction at the Data Engine level — no extra round-trip to Postgres |
| Request input recording | ~0.05ms | Captured at gateway ingress before routing |
Typical observed overhead
- p50 latency added: <0.5ms
- p99 latency added: <2ms
- No CPU overhead on the runtime isolate (recording runs in a separate Rust thread)
If a span write fails (e.g. storage is temporarily unavailable), the request still completes normally. Recording failures are non-fatal and are retried asynchronously.
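The failure-isolation pattern above can be sketched as follows. `SpanWriter`, the retry count, and the backoff are illustrative assumptions, not Flux's actual internals:

```typescript
// Sketch of non-fatal, fire-and-forget span recording. The request
// handler never awaits this on its critical path; it is dispatched
// after the response is flushed.
type Span = { requestId: string; name: string; durationMs: number };

interface SpanWriter {
  write(span: Span): Promise<void>;
}

async function recordSpan(
  writer: SpanWriter,
  span: Span,
  maxRetries = 3,
): Promise<boolean> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      await writer.write(span);
      return true; // recorded
    } catch {
      // Recording failures are non-fatal: swallow the error,
      // back off briefly, and retry asynchronously.
      await new Promise((r) => setTimeout(r, 2 ** attempt * 10));
    }
  }
  return false; // gave up -- the original request was never affected
}
```

Because the return value is only ever inspected by the recording subsystem, a storage outage degrades to missing traces rather than failed requests.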
Cold starts & isolate reuse
Are functions warm? Yes — Deno V8 isolates are pre-warmed and pooled per project. A fresh isolate is initialised at deploy time and kept alive across requests.
Are isolates reused? Yes, within a request window. The runtime reuses the same isolate for consecutive requests to the same function, similar to how Node.js reuses a process. Each request gets a fresh execution context (no shared state between requests) but the V8 heap is warmed.
What causes a cold start?
- First deploy to a new environment
- Isolate eviction after a long period of inactivity (>15 minutes on Free, >1 hour on Builder+)
- Runtime update (triggered by Flux infrastructure, not your deployments)
Typical cold start time: 80–150ms for a function with no large dependencies.
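Because the V8 heap survives between requests to the same isolate, expensive one-time setup can live at module scope and be paid only on cold start. The handler shape below is a generic sketch, not a specific Flux API:

```typescript
// Module scope survives across requests in a warm isolate, so setup
// done here runs once per isolate, not once per request.
let initCount = 0;
let compiledRules: RegExp[] | null = null;

function getRules(): RegExp[] {
  if (compiledRules === null) {
    initCount++; // only on cold start (or after eviction)
    compiledRules = ["^/api/", "^/admin/"].map((p) => new RegExp(p));
  }
  return compiledRules;
}

function handler(path: string): { allowed: boolean } {
  // Each request gets a fresh execution context, but getRules()
  // reuses the warmed module-scope cache.
  return { allowed: getRules().some((r) => r.test(path)) };
}
```

Keep only derived, request-independent values in module scope; per-request state must stay inside the handler, since the isolate is shared across consecutive requests.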
Concurrency model
Each function runs inside a sandboxed Deno V8 isolate. The runtime allocates isolates in a pool per project:
| Plan | Concurrent requests per project | Isolate pool size |
|---|---|---|
| Free | 10 | 2 |
| Builder | 50 | 10 |
| Pro | 200 | 40 |
| Enterprise | Custom | Custom |
Requests above the concurrency limit are queued for up to 5 seconds, then return a 503 with a Retry-After header. Execution records are still created for queued and rejected requests.
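A client can cooperate with this queueing behaviour by honouring the Retry-After header on 503s. The fetch implementation is injected so the sketch stays self-contained; in real code you would pass the global fetch:

```typescript
// Retry on 503 + Retry-After, as returned when the concurrency
// queue overflows. FetchLike is a simplified stand-in for fetch.
type FetchLike = (
  url: string,
) => Promise<{ status: number; headers: Map<string, string> }>;

async function fetchWithRetry(
  doFetch: FetchLike,
  url: string,
  maxRetries = 2,
): Promise<number> {
  for (let attempt = 0; ; attempt++) {
    const res = await doFetch(url);
    if (res.status !== 503 || attempt >= maxRetries) return res.status;
    // Honour Retry-After (seconds) before the next attempt.
    const waitS = Number(res.headers.get("Retry-After") ?? "1");
    await new Promise((r) => setTimeout(r, waitS * 1000));
  }
}
```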
Trace storage
Execution records are stored in three tables in your project's Postgres database, managed by the Data Engine:
| Table | Contains | Indexed by |
|---|---|---|
| trace_requests | Request metadata: method, path, status, timing | request_id, created_at |
| execution_spans | Span tree: name, parent, start, duration, error | request_id, span_id |
| state_mutations | DB mutations: table, row, column, old value, new value | request_id, table_name, row_id |
You can query these tables directly with standard Postgres tooling. They are owned by the flux schema and are read-only outside the Data Engine.
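Since execution_spans stores the span tree as flat rows with parent pointers, one common task is reassembling the tree after a read. The row shape below is inferred from the table description above; the query in the comment is illustrative:

```typescript
// Reassemble the span tree from flat execution_spans rows, e.g. as
// returned by: SELECT span_id, parent, name
//              FROM flux.execution_spans WHERE request_id = $1
type SpanRow = { span_id: string; parent: string | null; name: string };
type SpanNode = SpanRow & { children: SpanNode[] };

function buildSpanTree(rows: SpanRow[]): SpanNode[] {
  const nodes = new Map<string, SpanNode>();
  for (const r of rows) nodes.set(r.span_id, { ...r, children: [] });
  const roots: SpanNode[] = [];
  for (const node of nodes.values()) {
    const parent = node.parent ? nodes.get(node.parent) : undefined;
    if (parent) parent.children.push(node);
    else roots.push(node); // no parent recorded -> top-level span
  }
  return roots;
}
```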
Storage estimates
- ~2–5 KB per execution record (spans only)
- ~1 KB per database mutation
- A project with 1M monthly executions and 2 mutations/request uses roughly 7–10 GB/month before retention pruning
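A back-of-envelope check of that estimate, using the per-record sizes above. Raw row data for 1M executions with 2 mutations each comes to 4–7 GB, so the headline figure presumably includes index and partition overhead; the `indexOverhead` multiplier below is an assumption for illustration, not a documented Flux number:

```typescript
// Monthly trace storage estimate from the bullet sizes above:
// 2-5 KB per execution record plus ~1 KB per mutation.
function estimateMonthlyGB(
  executions: number,
  mutationsPerRequest: number,
  indexOverhead = 1.6, // assumed multiplier for indexes + partitions
): { lowGB: number; highGB: number } {
  const kbPerGB = 1_000_000;
  const lowKB = executions * (2 + mutationsPerRequest * 1);
  const highKB = executions * (5 + mutationsPerRequest * 1);
  return {
    lowGB: (lowKB * indexOverhead) / kbPerGB,
    highGB: (highKB * indexOverhead) / kbPerGB,
  };
}
```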
How replay guarantees determinism
Replay works because every execution record includes the complete input to the request — HTTP body, headers, auth context. Re-running that input against your current code produces a new execution from an identical starting point.
To prevent side-effects during replay, the runtime intercepts all outbound I/O at the isolate boundary:
- HTTP calls — intercepted; a stub returns a recorded response from the original trace (or fails with a clear error if no recording exists)
- Email / SMS — intercepted; send calls are no-ops that return `{ status: 'stubbed' }`
- Webhooks — intercepted; HTTP POST to external URLs returns 200 without making a real request
- Cron / async jobs — job enqueue calls are recorded but the jobs are not dispatched
- Database writes — pass through normally; mutations are recorded in the mutation log for comparison
Because external HTTP calls return the same recorded response from the original trace, your function sees the same data at every step — even if the external service would return different data today.
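The HTTP interception can be sketched as a fetch replacement that answers from the original trace's recordings instead of the network. The recording shape and error message are illustrative:

```typescript
// Replay-time fetch stub: outbound calls are answered from recorded
// responses, so the function sees the same bytes it saw originally.
type Recorded = { status: number; body: string };

function makeReplayFetch(recordings: Map<string, Recorded>) {
  return async (url: string): Promise<Recorded> => {
    const hit = recordings.get(url);
    if (!hit) {
      // Fail loudly rather than leak a live request during replay.
      throw new Error(`replay: no recorded response for ${url}`);
    }
    return hit; // identical to what the original execution received
  };
}
```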
Data retention
| Plan | Execution records | Mutation log | Replay window |
|---|---|---|---|
| Free | 14 days | 14 days | 14 days |
| Builder | 30 days | 30 days | 30 days |
| Pro | 90 days | 90 days | 90 days |
| Enterprise | Custom | Custom | Custom |
Records older than the retention window are pruned nightly. Pruning removes rows from trace_requests, execution_spans, and state_mutations. Your application data in other tables is never affected.
Scaling
Gateway — The gateway is a Rust binary that scales horizontally. It is stateless; all state is in Postgres. Load balancing and zero-downtime deploys are handled by the Flux infrastructure layer.
Runtime — The runtime pool scales vertically (more isolates per node) up to the plan concurrency limit, and horizontally beyond that on Pro and Enterprise plans.
Data Engine — The Data Engine is co-located with your Postgres instance. Write throughput scales with your Postgres plan. For very high mutation volumes, the Data Engine batches mutation log writes to avoid write amplification.
Trace storage — The execution_spans and state_mutations tables are partitioned by day. Queries over long time ranges are planned against only the relevant partitions, and old partitions are dropped atomically at the retention cutoff without locking live traffic.
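Dropping whole daily partitions is what keeps retention pruning cheap: no row-by-row deletes, just a metadata operation per day outside the window. The `table_YYYYMMDD` naming below is an assumption for illustration; Flux's actual partition scheme may differ:

```typescript
// Identify daily partitions older than the retention window.
// Naming convention (table_YYYYMMDD) is assumed, not documented.
function partitionsToDrop(
  table: string,
  today: Date,
  retentionDays: number,
  oldestDays: number, // how far back existing partitions go
): string[] {
  const out: string[] = [];
  for (let d = oldestDays; d > retentionDays; d--) {
    const day = new Date(today.getTime() - d * 86_400_000);
    const stamp = day.toISOString().slice(0, 10).replace(/-/g, "");
    out.push(`${table}_${stamp}`); // candidate for an atomic DROP
  }
  return out;
}
```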