Production Architecture
Understand how a Forge process is structured, what roles it can play, and what a minimal production deployment looks like before you wire up containers and load balancers.
Single-Binary, All Subsystems
Every Forge application compiles to one binary. When it starts, it brings up all of the following subsystems in a single process:
┌─────────────────────────────────────────────────────┐
│ forge binary │
│ │
│ ┌────────────┐ ┌────────────┐ ┌───────────────┐ │
│ │ Gateway │ │ Worker │ │ Scheduler │ │
│ │ (Axum) │ │ (jobs) │ │ (cron/leader)│ │
│ └────────────┘ └────────────┘ └───────────────┘ │
│ │
│ ┌────────────┐ ┌────────────┐ ┌───────────────┐ │
│ │ Reactor │ │ Daemons │ │ Workflow │ │
│ │ (SSE/RT) │ │ │ │ Executor │ │
│ └────────────┘ └────────────┘ └───────────────┘ │
│ │
│ ↕ PostgreSQL │
└─────────────────────────────────────────────────────┘
There is no separate worker process, no separate scheduler, no sidecar. One binary, one PostgreSQL database. You deploy more copies of the same binary when you need to scale or add redundancy.
What each subsystem does
| Subsystem | Role |
|---|---|
| Gateway | Axum HTTP server. Handles RPC (/_api/rpc/*), SSE (/_api/events), health probes, and static frontend assets. |
| Worker | Polls forge_jobs with FOR UPDATE SKIP LOCKED. Executes background jobs concurrently, bounded by the semaphore size in [worker]. |
| Scheduler | Triggers cron jobs on schedule. Only the elected leader node runs this — others stand by. |
| Reactor | Listens on the PostgreSQL forge_changes NOTIFY channel. On a change, debounces, re-executes affected queries, and pushes diffs to connected SSE clients. |
| Daemons | Long-running background loops. Either leader-only (one instance per cluster via advisory lock) or replicated (one per node). |
| Workflow Executor | Resumes durable workflows from their persisted checkpoint. Handles step re-execution, compensation, and durable sleep. |
Node Roles
By default every node runs every subsystem. You can restrict what a node does via forge.toml:
[node]
roles = ["gateway", "worker", "scheduler", "function"]
| Role | Enables |
|---|---|
| gateway | HTTP server and SSE endpoint |
| function | Query and mutation execution |
| worker | Background job processing |
| scheduler | Cron scheduling (leader-elected) |
All four roles enabled is the right default for single-node and small multi-node deployments. Split roles when you need to isolate concerns — for example, to put gateway nodes behind a WAF while worker nodes have no inbound HTTP.
Deployment Topologies
Single node (development, staging, small apps)
Internet → [ forge binary ] → PostgreSQL
One node, all roles, one database. This is what cargo run gives you locally and what the Docker Compose in Deploy sets up. It handles hundreds of concurrent connections before you need anything else.
No load balancer needed. No cluster config needed. Migrations run on startup and block until complete.
Minimum viable production: two nodes
Internet → [ Load Balancer ]
              ↙           ↘
   [ forge node A ]   [ forge node B ]
              ↘           ↙
           [ PostgreSQL ]
Two nodes, all roles, one PostgreSQL instance, one load balancer. This gives you:
- Zero-downtime deploys (rolling update: start node B, drain node A)
- Failover if one node crashes (the other keeps serving and claims orphaned jobs within 15 seconds)
- Double the worker throughput
The load balancer routes based on /_api/ready. Nodes that are starting up (joining) or shutting down (draining) return 503 from /_api/ready and drop out of rotation automatically.
# forge.toml — same file on both nodes
[cluster]
discovery = "postgres"
[node]
roles = ["gateway", "worker", "scheduler", "function"]
One node wins the scheduler advisory lock. The other stands by. If the leader crashes, the standby acquires the lock within the next heartbeat interval (default 5 seconds).
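The 15-second failover window quoted above is consistent with heartbeat-based liveness detection using a 5-second heartbeat. The following is a minimal sketch of that idea, assuming a node counts as dead once its last heartbeat is older than the threshold. The function names and the exact comparison are illustrative and are not Forge's API.

```rust
use std::time::Duration;

/// Values taken from the figures quoted in this document.
const HEARTBEAT_INTERVAL: Duration = Duration::from_secs(5);
const DEAD_THRESHOLD: Duration = Duration::from_secs(15);

/// Hypothetical check: a node is treated as dead once its last heartbeat
/// is older than the threshold. Whether Forge uses `>` or `>=` is an assumption.
fn is_dead(since_last_heartbeat: Duration, dead_threshold: Duration) -> bool {
    since_last_heartbeat > dead_threshold
}

/// Number of whole heartbeat intervals that fit inside the dead threshold.
/// Returns `None` for a zero interval to avoid dividing by zero.
fn heartbeats_per_threshold(interval: Duration, threshold: Duration) -> Option<u128> {
    if interval.is_zero() {
        return None;
    }
    Some(threshold.as_millis() / interval.as_millis())
}
```

With these values a node can miss up to three consecutive heartbeats before its peers would consider it gone, which is why raising the heartbeat interval without raising the threshold makes false positives more likely.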
Separated concerns: API + worker nodes
For higher throughput or to isolate HTTP traffic from CPU-heavy job processing:
Internet → [ Load Balancer ]
              ↙           ↘
     [ API node ]     [ API node ]        (gateway + function, no worker)
                    ↕
              [ PostgreSQL ]
                    ↕
   [ Worker node ]   [ Worker node ]      (worker only, no gateway)
# API nodes
[node]
roles = ["gateway", "function"]
# Worker nodes
[node]
roles = ["worker"]
worker_capabilities = ["general"]
Worker nodes do not bind an HTTP port. They connect to PostgreSQL and poll for jobs. You can scale worker and API nodes independently. See Worker Pools for capability-based routing.
Leader Election
Certain subsystems run on exactly one node at a time:
- Scheduler — triggers cron jobs; duplicate execution would double-fire scheduled tasks
- Leader-mode daemons — daemons marked as leader-only in their config
Election uses a PostgreSQL advisory lock. The first node to acquire pg_try_advisory_lock(0x464F52470001) becomes the scheduler leader. If that node crashes, its database connection closes, PostgreSQL releases the lock, and another node acquires it within seconds.
No quorum, no Raft, no ZooKeeper. The database connection is the lease. Clock skew cannot cause split-brain because the lock is not time-based.
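When debugging leadership, it can help to find the lock holder in PostgreSQL's pg_locks view, which displays a bigint advisory key as two 32-bit halves: the high half in classid and the low half in objid. Below is a small sketch of that split for the key quoted above. The constant and helper names are illustrative, not part of Forge.

```rust
/// Advisory lock key quoted in the text for scheduler leadership.
/// Its leading bytes (46 4F 52 47) are ASCII for "FORG".
const SCHEDULER_LOCK_KEY: i64 = 0x464F_5247_0001;

/// Splits a bigint advisory lock key the way `pg_locks` displays it:
/// high 32 bits as `classid`, low 32 bits as `objid`.
fn split_advisory_key(key: i64) -> (u32, u32) {
    let bits = key as u64; // reinterpret the sign bit, keep all 64 bits
    ((bits >> 32) as u32, bits as u32)
}
```

The resulting pair can be matched against classid and objid in pg_locks rows where locktype is 'advisory'; the pid on that row belongs to the connection of the current leader.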
Configuration for Production
A production forge.toml typically sets the sections shown below. Environment variable substitution (${VAR} and ${VAR-default}) works in any string value.
[project]
name = "my-app"
[database]
url = "${DATABASE_URL}"
pool_size = 20 # tune for your workload and PG max_connections
[gateway]
port = 8080
host = "0.0.0.0"
[cluster]
discovery = "postgres"
heartbeat_interval = "5s"
dead_threshold = "15s"
[node]
roles = ["gateway", "worker", "scheduler", "function"]
[worker]
concurrency = 16 # jobs processed simultaneously per node
[auth]
jwt_secret = "${JWT_SECRET}"
[observability]
enabled = true
otlp_endpoint = "${OTEL_EXPORTER_OTLP_ENDPOINT}" # e.g. http://collector:4318
Key environment variables:
| Variable | Required | Description |
|---|---|---|
| DATABASE_URL | Yes | postgres://user:pass@host:5432/db |
| JWT_SECRET | If using auth | Minimum 32 bytes |
| RUST_LOG | No | info for production, debug for troubleshooting |
No other environment variables are required. Everything else lives in forge.toml.
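To make the ${VAR} and ${VAR-default} syntax concrete, here is a minimal sketch of how such substitution could work. It is not Forge's implementation: the function name is hypothetical, and treating an unset variable with no default as an error is an assumption. As in POSIX shells, the default applies only when the variable is unset.

```rust
use std::collections::HashMap;

/// Expands `${VAR}` and `${VAR-default}` in `input` using `env`.
/// Returns `Err` when a variable is unset and has no default,
/// or when a `${` is never closed.
fn expand(input: &str, env: &HashMap<String, String>) -> Result<String, String> {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(start) = rest.find("${") {
        // Copy the literal text before the placeholder.
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        let end = after
            .find('}')
            .ok_or_else(|| "unterminated ${".to_string())?;
        let body = &after[..end];
        // Split `NAME-default` at the first hyphen, if any.
        let (name, default) = match body.split_once('-') {
            Some((n, d)) => (n, Some(d)),
            None => (body, None),
        };
        match (env.get(name), default) {
            (Some(value), _) => out.push_str(value),
            (None, Some(d)) => out.push_str(d),
            (None, None) => return Err(format!("missing variable {name}")),
        }
        rest = &after[end + 1..];
    }
    out.push_str(rest);
    Ok(out)
}
```

Failing fast on a missing required variable such as DATABASE_URL is usually preferable to starting with an empty string, since the error then surfaces at startup rather than at the first query.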
Health Checks
Both endpoints are always available when the gateway role is enabled.
| Endpoint | Probe type | Returns |
|---|---|---|
/_api/health | Liveness | 200 always (process is up) |
/_api/ready | Readiness | 200 when DB reachable and reactor ready; 503 otherwise |
Use /_api/health as the liveness probe (restart if the process is wedged). Use /_api/ready as the readiness probe (only route traffic here when it returns 200).
The readiness probe also returns 503 when in-flight workflow runs exist for a handler version that is no longer registered — it forces you to drain stranded workflows before the node accepts new traffic.
# Kubernetes
livenessProbe:
httpGet:
path: /_api/health
port: 8080
periodSeconds: 10
failureThreshold: 3
readinessProbe:
httpGet:
path: /_api/ready
port: 8080
periodSeconds: 5
failureThreshold: 1
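The readiness rules described in this section can be summarized as a single decision. The sketch below is illustrative only: the type and field names are hypothetical, and the Active state is an assumed name for a node that is neither joining nor draining.

```rust
/// Lifecycle states; `Joining` and `Draining` are named in this document,
/// `Active` is an assumed label for normal operation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum NodeState {
    Joining,
    Active,
    Draining,
}

/// Inputs to the readiness decision. Field names are hypothetical.
#[derive(Debug, Clone, Copy)]
struct Readiness {
    state: NodeState,
    db_reachable: bool,
    reactor_ready: bool,
    /// In-flight workflow runs whose handler version is no longer registered.
    stranded_workflow_runs: u64,
}

/// Status for `/_api/ready`: 200 only when every condition holds, else 503.
fn ready_status(r: Readiness) -> u16 {
    let ready = r.state == NodeState::Active
        && r.db_reachable
        && r.reactor_ready
        && r.stranded_workflow_runs == 0;
    if ready { 200 } else { 503 }
}

/// Status for `/_api/health`: 200 whenever the process can answer at all.
fn health_status() -> u16 {
    200
}
```

Keeping liveness unconditional and putting every dependency check into readiness means a database outage takes nodes out of rotation without triggering restarts that would not fix anything.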
What You Need to Operate
A complete Forge production deployment requires:
- PostgreSQL 18: holds all state (jobs, workflows, sessions, signals, node registry, cron schedule)
- One or more instances of your binary: same binary, any number of nodes
- A load balancer: routes to healthy nodes via /_api/ready; sticky sessions needed only for MCP OAuth (/_api/oauth/*)
- No other infrastructure: no Redis, no message bus, no separate worker process, no service mesh required
Optional but recommended for production at scale:
- Read replicas: configure under [database.replicas] to offload query traffic; see Multiple Nodes
- OTLP collector: for distributed traces and metrics; configure [observability]
- Connection pooler (PgBouncer or RDS Proxy): if you run many nodes and approach PostgreSQL's max_connections limit
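A quick way to judge whether a pooler is needed is to compare the sum of all node pools with PostgreSQL's connection limit. The sketch below assumes each node opens at most pool_size connections; any dedicated connections Forge may hold outside the pool (for example for LISTEN or the leader lock) are not counted, and the helper names are illustrative.

```rust
/// Connections the cluster may open if every node fills its pool.
fn total_pool_connections(nodes: u32, pool_size: u32) -> u32 {
    nodes.saturating_mul(pool_size)
}

/// True when all pools fit under `max_connections` minus a reserved margin
/// (for example PostgreSQL's `superuser_reserved_connections`).
fn fits_connection_budget(nodes: u32, pool_size: u32, max_connections: u32, reserved: u32) -> bool {
    total_pool_connections(nodes, pool_size) <= max_connections.saturating_sub(reserved)
}

/// Largest node count that stays within budget; `None` if `pool_size` is zero.
fn max_nodes(pool_size: u32, max_connections: u32, reserved: u32) -> Option<u32> {
    if pool_size == 0 {
        return None;
    }
    Some(max_connections.saturating_sub(reserved) / pool_size)
}
```

With pool_size = 20 and PostgreSQL's stock settings of max_connections = 100 and superuser_reserved_connections = 3, four nodes fit within the budget and a fifth does not, so that is roughly the point at which to raise the limit, shrink the pools, or add a pooler.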