Production Guide

Everything you need to know before running Flux in production.

Running on your own infrastructure?
This page covers the Flux Cloud production environment. If you are self-hosting, see the Self-Hosting guide — the operational model (stateless services, one Postgres) is the same, but scaling and configuration are in your hands.

Performance overhead

TL;DR

Span recording adds <1ms to typical requests (async, off the critical path). Mutation logging adds ~0.2ms per write, batched into the same transaction. No SDK, no instrumentation — overhead is built into the infrastructure layer.

Recording every execution adds overhead at two points:

| What is recorded | Overhead | How it is minimised |
|---|---|---|
| Span recording (gateway, function, tool calls) | ~0.1ms per span, async | Written after the response is flushed — never on the critical path |
| Mutation logging (DB writes) | ~0.2ms per mutation | Batched in the same transaction at the Data Engine level — no extra round-trip to Postgres |
| Request input recording | ~0.05ms | Captured at gateway ingress before routing |

Typical observed overhead

  • p50 latency added: <0.5ms
  • p99 latency added: <2ms
  • No CPU overhead on the runtime isolate (recording runs in a separate Rust thread)

If a span write fails (e.g. storage is temporarily unavailable), the request still completes normally. Recording failures are non-fatal and are retried asynchronously.
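The "non-fatal, retried asynchronously" behaviour can be sketched as follows. This is an illustration of the pattern, not Flux internals; `recordSpan` and `SpanWriter` are our names.

```typescript
// Sketch of non-fatal, asynchronously retried span recording.
// All names here (SpanWriter, recordSpan) are illustrative.

type Span = { name: string; durationMs: number };
type SpanWriter = (span: Span) => Promise<void>;

// Fire-and-forget: the request handler never awaits this, so a
// storage outage cannot delay or fail the response.
async function recordSpan(
  write: SpanWriter,
  span: Span,
  retries = 3,
): Promise<boolean> {
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      await write(span);
      return true; // recorded
    } catch {
      // Recording failures are non-fatal; back off and retry.
      await new Promise((r) => setTimeout(r, 2 ** attempt * 10));
    }
  }
  return false; // dropped after exhausting retries — request unaffected
}
```

The key property is that the caller never awaits the returned promise on the request path, so a slow or failing writer only affects the recording, never the response.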

Cold starts & isolate reuse

Are functions warm? Yes — Deno V8 isolates are pre-warmed and pooled per project. A fresh isolate is initialised at deploy time and kept alive across requests.

Are isolates reused? Yes, within a request window. The runtime reuses the same isolate for consecutive requests to the same function, similar to how Node.js reuses a process. Each request gets a fresh execution context (no shared state between requests) but the V8 heap is warmed.
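What "warm isolate, fresh execution context" means for your code can be sketched like this. The handler shape is hypothetical, not the actual Flux signature; the point is the scoping.

```typescript
// Illustrative only: module scope survives across requests in the
// same warm isolate; per-request state does not.

// Initialised once per isolate (at cold start), reused while warm.
const startedAtMs = Date.now();
let warmHits = 0;

// Hypothetical handler shape — not the actual Flux signature.
function handler(req: { path: string }) {
  warmHits++; // increments across warm requests in this isolate
  const requestScoped = { path: req.path }; // fresh every request
  return {
    warmHits,
    coldStarted: warmHits === 1,
    isolateAgeMs: Date.now() - startedAtMs,
    path: requestScoped.path,
  };
}
```

The practical implication: anything you hoist to module scope (DB clients, parsed config) is initialised once per isolate, while anything derived from the request must be built per call.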

What causes a cold start?

  • First deploy to a new environment
  • Isolate eviction after a long period of inactivity (>15 minutes on Free, >1 hour on Builder+)
  • Runtime update (triggered by Flux infrastructure, not your deployments)

Typical cold start time: 80–150ms for a function with no large dependencies.

Concurrency model

Each function runs inside a sandboxed Deno V8 isolate. The runtime allocates isolates in a pool per project:

| Plan | Concurrent requests per project | Isolate pool size |
|---|---|---|
| Free | 10 | 2 |
| Builder | 50 | 10 |
| Pro | 200 | 40 |
| Enterprise | Custom | Custom |

Requests above the concurrency limit are queued for up to 5 seconds, then return a 503 with a Retry-After header. Execution records are still created for queued and rejected requests.
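A client that respects this behaviour should read the Retry-After header before retrying. A minimal sketch, assuming the header carries a delay in seconds (it can also be an HTTP date, which this sketch ignores); `retryDelayMs` is our helper name, not part of any Flux SDK.

```typescript
// Compute how long to wait before retrying a queued-out request.
// Returns null when no retry is needed. Assumes Retry-After is a
// delay in seconds; a real client should also handle the date form.
function retryDelayMs(res: {
  status: number;
  headers: Map<string, string>;
}): number | null {
  if (res.status !== 503) return null;
  const header = res.headers.get("retry-after");
  const seconds = header ? Number(header) : NaN;
  // Fall back to 1s if the header is missing or unparsable.
  return Number.isFinite(seconds) ? seconds * 1000 : 1000;
}
```

The sketch uses a Map with lowercase keys; the standard Headers object is case-insensitive, so the lookup translates directly.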

Trace storage

Execution records are stored in three tables in your project's Postgres database, managed by the Data Engine:

| Table | Contains | Indexed by |
|---|---|---|
| trace_requests | Request metadata: method, path, status, timing | request_id, created_at |
| execution_spans | Span tree: name, parent, start, duration, error | request_id, span_id |
| state_mutations | DB mutations: table, row, column, old value, new value | request_id, table_name, row_id |

You can query these tables directly with standard Postgres tooling. They live in the flux schema and are read-only outside the Data Engine.
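As an example of querying the trace tables directly, the query below finds the slowest traced requests of the last day. Table and column names come from the table above; the query itself is a sketch to adapt to your Postgres client.

```typescript
// Slowest requests in the last 24h, with their span counts.
// Column names (created_at, duration) follow the trace-table schema
// described above; verify against your own database before use.
const slowRequestsSql = `
  SELECT r.request_id, r.path, r.status,
         count(s.span_id) AS span_count
  FROM flux.trace_requests r
  JOIN flux.execution_spans s USING (request_id)
  WHERE r.created_at > now() - interval '24 hours'
  GROUP BY r.request_id, r.path, r.status
  ORDER BY max(s.duration) DESC
  LIMIT 20;
`;
```

Because the tables are read-only outside the Data Engine, queries like this are safe to run against a production project.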

Storage estimates

  • ~2–5 KB per execution record (spans only)
  • ~1 KB per database mutation
  • A project with 1M monthly executions and 2 mutations/request uses roughly 7–10 GB/month before retention pruning
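The arithmetic behind that estimate can be checked with the per-record sizes quoted above; the ~40% overhead factor for indexes and row headers is our assumption, not a published figure.

```typescript
// Back-of-the-envelope trace storage estimate. Per-record sizes are
// the figures quoted above; the 1.4x overhead factor (indexes, row
// headers) is an assumption.
function monthlyTraceStorageGB(
  executions: number,
  mutationsPerRequest: number,
  spanKB = 5, // upper end of the 2–5 KB range above
  mutationKB = 1,
  overhead = 1.4, // assumed index + row-header overhead
): number {
  const kb =
    executions * (spanKB + mutationsPerRequest * mutationKB) * overhead;
  return kb / 1e6; // KB -> GB (decimal)
}
```

At 1M executions and 2 mutations/request this works out to roughly 9.8 GB at the upper end of the span-size range, in line with the 7–10 GB/month figure above.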

How replay guarantees determinism

Replay works because every execution record includes the complete input to the request — HTTP body, headers, auth context. Re-running that input against your current code produces a new execution from an identical starting point.

To prevent side-effects during replay, the runtime intercepts all outbound I/O at the isolate boundary:

  • HTTP calls — intercepted; a stub returns a recorded response from the original trace (or fails with a clear error if no recording exists)
  • Email / SMS — intercepted; send calls are no-ops that return { status: 'stubbed' }
  • Webhooks — intercepted; HTTP POST to external URLs returns 200 without making a real request
  • Cron / async jobs — job enqueue calls are recorded but the jobs are not dispatched
  • Database writes — pass through normally; mutations are recorded in the mutation log for comparison

Because external HTTP calls return the same recorded response from the original trace, your function sees the same data at every step — even if the external service would return different data today.
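The HTTP interception behaviour can be sketched as a replay-time fetch that answers from the recorded trace and fails loudly when no recording exists. Names here (`makeReplayFetch`, `RecordedResponse`) are illustrative, not Flux internals.

```typescript
// Minimal sketch of replay-time HTTP interception: outbound calls
// are answered from the recorded trace, never from the live service.

type RecordedResponse = { status: number; body: string };

function makeReplayFetch(recordings: Map<string, RecordedResponse>) {
  return async (url: string): Promise<RecordedResponse> => {
    const recorded = recordings.get(url);
    if (!recorded) {
      // Mirrors the documented behaviour: fail with a clear error
      // rather than making a real request.
      throw new Error(`replay: no recorded response for ${url}`);
    }
    return recorded; // same data the original execution saw
  };
}
```

Because every outbound call resolves to the recorded response, the replayed execution is deterministic even when the external service's live answer has since changed.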

Data retention

| Plan | Execution records | Mutation log | Replay window |
|---|---|---|---|
| Free | 14 days | 14 days | 14 days |
| Builder | 30 days | 30 days | 30 days |
| Pro | 90 days | 90 days | 90 days |
| Enterprise | Custom | Custom | Custom |

Records older than the retention window are pruned nightly. Pruning removes rows from trace_requests, execution_spans, and state_mutations. Your application data in other tables is never affected.

Scaling

Gateway — The gateway is a Rust binary that scales horizontally. It is stateless; all state is in Postgres. Load balancing and zero-downtime deploys are handled by the Flux infrastructure layer.

Runtime — The runtime pool scales vertically (more isolates per node) up to the plan concurrency limit, and horizontally beyond that on Pro and Enterprise plans.

Data Engine — The Data Engine is co-located with your Postgres instance. Write throughput scales with your Postgres plan. For very high mutation volumes, the Data Engine batches mutation log writes to avoid write amplification.

Trace storage — The execution_spans and state_mutations tables are partitioned by day. Queries over long time ranges are planned against only the relevant partitions, and old partitions are dropped atomically at the retention cutoff without locking live traffic.
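Day-based partitioning of the trace tables looks roughly like the DDL below. This is our sketch of the scheme described above, not the actual Flux migrations, and the column list is abbreviated.

```typescript
// Hedged illustration of day-partitioned trace storage in Postgres.
// Dropping a day's partition is atomic and does not lock the others.
const partitionDdl = `
  CREATE TABLE flux.execution_spans (
    request_id uuid NOT NULL,
    span_id    uuid NOT NULL,
    created_at timestamptz NOT NULL
    -- ...remaining span columns...
  ) PARTITION BY RANGE (created_at);

  -- One partition per day; retention pruning drops whole partitions.
  CREATE TABLE flux.execution_spans_2025_01_01
    PARTITION OF flux.execution_spans
    FOR VALUES FROM ('2025-01-01') TO ('2025-01-02');
`;
```

Range partitioning on created_at is what lets a retention cutoff translate into a cheap DROP TABLE on old partitions instead of a bulk DELETE.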

