Production Guide
Everything you need to know before running Flux in production.
Performance overhead
Span recording adds <1ms to typical requests (async, off the critical path). Mutation logging adds ~0.2ms per write, batched into the same transaction. No SDK, no instrumentation — overhead is built into the infrastructure layer.
Recording every execution adds overhead at three points:
| What is recorded | Overhead | How it is minimised |
|---|---|---|
| Span recording (gateway, function, tool calls) | ~0.1ms per span, async | Written after the response is flushed — never on the critical path |
| Mutation logging (DB writes) | ~0.2ms per mutation | Batched in the same transaction at the Data Engine level — no extra round-trip to Postgres |
| Request input recording | ~0.05ms | Captured at gateway ingress before routing |
Typical observed overhead
- p50 latency added: <0.5ms
- p99 latency added: <2ms
- No CPU overhead on the runtime isolate (recording runs in a separate Rust thread)
If a span write fails (e.g. storage is temporarily unavailable), the request still completes normally. Recording failures are non-fatal and are retried asynchronously.
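The failure-isolation pattern above can be sketched as follows. `SpanWriter`, the retry count, and the backoff are illustrative assumptions, not Flux's actual internals:

```typescript
// Sketch of non-fatal, fire-and-forget span recording. The request
// handler never awaits this on its critical path; it is dispatched
// after the response is flushed.
type Span = { requestId: string; name: string; durationMs: number };

interface SpanWriter {
  write(span: Span): Promise<void>;
}

async function recordSpan(
  writer: SpanWriter,
  span: Span,
  maxRetries = 3,
): Promise<boolean> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      await writer.write(span);
      return true; // recorded
    } catch {
      // Recording failures are non-fatal: swallow the error,
      // back off briefly, and retry asynchronously.
      await new Promise((r) => setTimeout(r, 2 ** attempt * 10));
    }
  }
  return false; // gave up -- the original request was never affected
}
```

Because the return value is only ever inspected by the recording subsystem, a storage outage degrades to missing traces rather than failed requests.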
Cold starts & isolate reuse
Are functions warm? Yes — Deno V8 isolates are pre-warmed and pooled per project. A fresh isolate is initialised at deploy time and kept alive across requests.
Are isolates reused? Yes, within a request window. The runtime reuses the same isolate for consecutive requests to the same function, similar to how Node.js reuses a process. Each request gets a fresh execution context (no shared state between requests) but the V8 heap is warmed.
What causes a cold start?
- First deploy to a new environment
- Isolate eviction after a long period of inactivity (>15 minutes on Free, >1 hour on Builder+)
- Runtime update (triggered by Flux infrastructure, not your deployments)
Typical cold start time: 80–150ms for a function with no large dependencies.
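Because the V8 heap survives between requests to the same isolate, expensive one-time setup can live at module scope and be paid only on cold start. The handler shape below is a generic sketch, not a specific Flux API:

```typescript
// Module scope survives across requests in a warm isolate, so setup
// done here runs once per isolate, not once per request.
let initCount = 0;
let compiledRules: RegExp[] | null = null;

function getRules(): RegExp[] {
  if (compiledRules === null) {
    initCount++; // only on cold start (or after eviction)
    compiledRules = ["^/api/", "^/admin/"].map((p) => new RegExp(p));
  }
  return compiledRules;
}

function handler(path: string): { allowed: boolean } {
  // Each request gets a fresh execution context, but getRules()
  // reuses the warmed module-scope cache.
  return { allowed: getRules().some((r) => r.test(path)) };
}
```

Keep only derived, request-independent values in module scope; per-request state must stay inside the handler, since the isolate is shared across consecutive requests.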
Concurrency model
Each function runs inside a sandboxed Deno V8 isolate. The runtime allocates isolates in a pool per project:
| Plan | Concurrent requests per project | Isolate pool size |
|---|---|---|
| Free | 10 | 2 |
| Builder | 50 | 10 |
| Pro | 200 | 40 |
| Enterprise | Custom | Custom |
Requests above the concurrency limit are queued for up to 5 seconds, then return a 503 with a Retry-After header. Execution records are still created for queued and rejected requests.
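A client can cooperate with this queueing behaviour by honouring the Retry-After header on 503s. The fetch implementation is injected so the sketch stays self-contained; in real code you would pass the global fetch:

```typescript
// Retry on 503 + Retry-After, as returned when the concurrency
// queue overflows. FetchLike is a simplified stand-in for fetch.
type FetchLike = (
  url: string,
) => Promise<{ status: number; headers: Map<string, string> }>;

async function fetchWithRetry(
  doFetch: FetchLike,
  url: string,
  maxRetries = 2,
): Promise<number> {
  for (let attempt = 0; ; attempt++) {
    const res = await doFetch(url);
    if (res.status !== 503 || attempt >= maxRetries) return res.status;
    // Honour Retry-After (seconds) before the next attempt.
    const waitS = Number(res.headers.get("Retry-After") ?? "1");
    await new Promise((r) => setTimeout(r, waitS * 1000));
  }
}
```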
Trace storage
Execution records are stored in three tables in your project's Postgres database, managed by the Data Engine:
| Table | Contains | Indexed by |
|---|---|---|
| trace_requests | Request metadata: method, path, status, timing | request_id, created_at |
| execution_spans | Span tree: name, parent, start, duration, error | request_id, span_id |
| state_mutations | DB mutations: table, row, column, old value, new value | request_id, table_name, row_id |
You can query these tables directly with standard Postgres tooling. They are owned by the flux schema and are read-only outside the Data Engine.
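Since execution_spans stores the span tree as flat rows with parent pointers, one common task is reassembling the tree after a read. The row shape below is inferred from the table description above; the query in the comment is illustrative:

```typescript
// Reassemble the span tree from flat execution_spans rows, e.g. as
// returned by: SELECT span_id, parent, name
//              FROM flux.execution_spans WHERE request_id = $1
type SpanRow = { span_id: string; parent: string | null; name: string };
type SpanNode = SpanRow & { children: SpanNode[] };

function buildSpanTree(rows: SpanRow[]): SpanNode[] {
  const nodes = new Map<string, SpanNode>();
  for (const r of rows) nodes.set(r.span_id, { ...r, children: [] });
  const roots: SpanNode[] = [];
  for (const node of nodes.values()) {
    const parent = node.parent ? nodes.get(node.parent) : undefined;
    if (parent) parent.children.push(node);
    else roots.push(node); // no parent recorded -> top-level span
  }
  return roots;
}
```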
Storage estimates
- ~2–5 KB per execution record (spans only)
- ~1 KB per database mutation
- A project with 1M monthly executions and 2 mutations/request uses roughly 7–10 GB/month before retention pruning
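A back-of-envelope check of that estimate, using the per-record sizes above. Raw row data for 1M executions with 2 mutations each comes to 4–7 GB, so the headline figure presumably includes index and partition overhead; the `indexOverhead` multiplier below is an assumption for illustration, not a documented Flux number:

```typescript
// Monthly trace storage estimate from the bullet sizes above:
// 2-5 KB per execution record plus ~1 KB per mutation.
function estimateMonthlyGB(
  executions: number,
  mutationsPerRequest: number,
  indexOverhead = 1.6, // assumed multiplier for indexes + partitions
): { lowGB: number; highGB: number } {
  const kbPerGB = 1_000_000;
  const lowKB = executions * (2 + mutationsPerRequest * 1);
  const highKB = executions * (5 + mutationsPerRequest * 1);
  return {
    lowGB: (lowKB * indexOverhead) / kbPerGB,
    highGB: (highKB * indexOverhead) / kbPerGB,
  };
}
```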
How replay guarantees determinism
Replay works because every execution record includes the complete input to the request — HTTP body, headers, auth context. Re-running that input against your current code produces a new execution from an identical starting point.
To prevent side-effects during replay, the runtime intercepts all outbound I/O at the isolate boundary:
- HTTP calls — intercepted; a stub returns a recorded response from the original trace (or fails with a clear error if no recording exists)
- Email / SMS — intercepted; send calls are no-ops that return `{ status: 'stubbed' }`
- Webhooks — intercepted; HTTP POST to external URLs returns 200 without making a real request
- Cron / async jobs — job enqueue calls are recorded but the jobs are not dispatched
- Database writes — pass through normally; mutations are recorded in the mutation log for comparison
Because external HTTP calls return the same recorded response from the original trace, your function sees the same data at every step — even if the external service would return different data today.
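The HTTP interception can be sketched as a fetch replacement that answers from the original trace's recordings instead of the network. The recording shape and error message are illustrative:

```typescript
// Replay-time fetch stub: outbound calls are answered from recorded
// responses, so the function sees the same bytes it saw originally.
type Recorded = { status: number; body: string };

function makeReplayFetch(recordings: Map<string, Recorded>) {
  return async (url: string): Promise<Recorded> => {
    const hit = recordings.get(url);
    if (!hit) {
      // Fail loudly rather than leak a live request during replay.
      throw new Error(`replay: no recorded response for ${url}`);
    }
    return hit; // identical to what the original execution received
  };
}
```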
Data retention
| Plan | Execution records | Mutation log | Replay window |
|---|---|---|---|
| Free | 14 days | 14 days | 14 days |
| Builder | 30 days | 30 days | 30 days |
| Pro | 90 days | 90 days | 90 days |
| Enterprise | Custom | Custom | Custom |
Records older than the retention window are pruned nightly. Pruning removes rows from trace_requests, execution_spans, and state_mutations. Your application data in other tables is never affected.
Scaling
Gateway — The gateway is a Rust binary that scales horizontally. It is stateless; all state is in Postgres. Load balancing and zero-downtime deploys are handled by the Flux infrastructure layer.
Runtime — The runtime pool scales vertically (more isolates per node) up to the plan concurrency limit, and horizontally beyond that on Pro and Enterprise plans.
Data Engine — The Data Engine is co-located with your Postgres instance. Write throughput scales with your Postgres plan. For very high mutation volumes, the Data Engine batches mutation log writes to avoid write amplification.
Trace storage — The execution_spans and state_mutations tables are partitioned by day. Queries over long time ranges are planned against only the relevant partitions, and old partitions are dropped atomically at the retention cutoff without locking live traffic.
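Dropping whole daily partitions is what keeps retention pruning cheap: no row-by-row deletes, just a metadata operation per day outside the window. The `table_YYYYMMDD` naming below is an assumption for illustration; Flux's actual partition scheme may differ:

```typescript
// Identify daily partitions older than the retention window.
// Naming convention (table_YYYYMMDD) is assumed, not documented.
function partitionsToDrop(
  table: string,
  today: Date,
  retentionDays: number,
  oldestDays: number, // how far back existing partitions go
): string[] {
  const out: string[] = [];
  for (let d = oldestDays; d > retentionDays; d--) {
    const day = new Date(today.getTime() - d * 86_400_000);
    const stamp = day.toISOString().slice(0, 10).replace(/-/g, "");
    out.push(`${table}_${stamp}`); // candidate for an atomic DROP
  }
  return out;
}
```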