Ship to Production
Go from docker compose up to a production deployment: build a single binary, configure for production, add observability, and scale.
What You'll Learn
- Building a release binary with the frontend embedded
- Configuring forge.toml for production
- Running migrations safely
- Setting up health checks and observability
- Scaling across multiple nodes with worker pools
- Deploying to Docker, Fly.io, or bare metal
Step 1: Build the Binary
Forge compiles your backend, frontend, and migrations into a single binary. No runtime dependencies beyond PostgreSQL.
SvelteKit Frontend
Use the multi-stage Dockerfile from the demo templates:
# Stage 1: Build frontend
FROM oven/bun:1-alpine AS frontend-builder
WORKDIR /app/frontend
COPY frontend/package.json frontend/bun.lock* ./
RUN bun install --frozen-lockfile || bun install
COPY frontend ./
RUN bun run build
# Stage 2: Build backend with embedded frontend
FROM rust:1-alpine AS builder
WORKDIR /app
RUN apk add --no-cache musl-dev openssl-dev openssl-libs-static pkgconf
COPY Cargo.toml Cargo.lock* ./
# Cache dependencies
RUN mkdir -p src && \
    echo "fn main() {}" > src/main.rs && \
    cargo build --release --no-default-features && \
    rm -rf src
COPY src ./src
COPY migrations ./migrations
COPY .sqlx ./.sqlx
COPY --from=frontend-builder /app/frontend/build ./frontend/build
# Build release with embedded frontend
RUN touch src/main.rs && cargo build --release
# Stage 3: Minimal production image
FROM alpine:3.21 AS production
WORKDIR /app
# curl is required for the container health checks shown later
RUN apk add --no-cache ca-certificates libgcc curl
COPY --from=builder /app/target/release/my-app /app/my-app
COPY --from=builder /app/migrations /app/migrations
EXPOSE 8080
ENV RUST_LOG=info
CMD ["/app/my-app"]
Dioxus Frontend
cd frontend && dx build --web --release && cd ..
cargo build --release
The embedded-frontend feature is on by default. The binary embeds the compiled frontend using rust-embed and serves it with SPA fallback. During development, disable it with cargo run --no-default-features to run the frontend dev server separately.
Result: one binary (~30MB Alpine image), zero runtime dependencies beyond PostgreSQL.
Step 2: Configure for Production
Create a production forge.toml. Use ${VAR_NAME} syntax for secrets and environment-specific values -- Forge substitutes them at startup.
[project]
name = "my-app"
[database]
url = "${DATABASE_URL}"
pool_size = 50
statement_timeout_secs = 30
# replica_urls = ["${DATABASE_REPLICA_URL}"]
# read_from_replica = true
[gateway]
port = 8080
cors_origins = ["https://myapp.com"]
quiet_routes = ["/_api/health", "/_api/ready"]
[auth]
jwt_algorithm = "HS256"
jwt_secret = "${JWT_SECRET}"
[worker]
max_concurrent_jobs = 20
poll_interval_ms = 100
[observability]
enabled = true
otlp_endpoint = "${OTEL_ENDPOINT}"
log_level = "info"
sampling_ratio = 0.5
Set environment variables for secrets. Never put credentials in forge.toml directly:
export DATABASE_URL="postgres://user:pass@db.example.com:5432/myapp"
export JWT_SECRET="$(openssl rand -base64 32)"
export OTEL_ENDPOINT="http://otel-collector:4318"
Unset variables without defaults remain as literal strings, making misconfiguration easy to detect.
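The substitution rule can be sketched in a few lines (an illustration of the documented behavior, not Forge's actual implementation):

```python
import os
import re

def substitute_env(value: str, env=None) -> str:
    """Replace each ${VAR_NAME} with its value from the environment.

    Unset variables are left as the literal ${VAR_NAME} text, matching the
    documented behavior, so a misconfigured value is easy to spot in logs.
    """
    env = os.environ if env is None else env
    return re.sub(r"\$\{(\w+)\}", lambda m: env.get(m.group(1), m.group(0)), value)

# An unset variable stays literal:
substitute_env("${DATABASE_URL}", {})  # -> "${DATABASE_URL}"
```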
Step 3: Run Migrations
Forge runs migrations automatically on startup using a PostgreSQL advisory lock (pg_advisory_lock(0x464F524745)). Multiple nodes can start simultaneously -- the first one runs migrations, the rest wait.
No manual step is needed. When the binary starts, it:
- Acquires the advisory lock (blocks if another node holds it)
- Runs any pending migrations from the migrations/ directory
- Releases the lock
- Starts serving traffic
If you prefer to run migrations explicitly before deploy:
# Using the Forge CLI
forge migrate up
The advisory lock is session-scoped. If a node crashes mid-migration, PostgreSQL releases the lock automatically when the connection drops.
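The lock sequence is plain PostgreSQL and can be reproduced in psql (an illustration of the mechanism; Forge performs these steps internally):

```sql
-- 0x464F524745 spells "FORGE" in ASCII
SELECT pg_advisory_lock(x'464F524745'::bigint);   -- blocks until acquired
-- ... apply pending migrations ...
SELECT pg_advisory_unlock(x'464F524745'::bigint); -- or just close the session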
Step 4: Add Observability
Enable OpenTelemetry in forge.toml. Forge auto-instruments everything without code changes:
[observability]
enabled = true
otlp_endpoint = "http://otel-collector:4318"
log_level = "info"
sampling_ratio = 1.0 # sample every trace; lower in high-traffic production
# enable_traces = true
# enable_metrics = true
# enable_logs = true
Forge uses OTLP over HTTP (port 4318), not gRPC (port 4317).
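With sampling_ratio = 0.5, roughly half of all traces are exported. Assuming Forge uses OpenTelemetry's standard TraceIdRatioBased sampler, the decision is a deterministic function of the trace ID, which can be sketched as:

```python
def should_sample(trace_id_hex: str, ratio: float) -> bool:
    # Interpret the low 8 bytes of the 128-bit trace id as an unsigned
    # integer and compare it against the ratio's share of the 64-bit space.
    threshold = int(ratio * (1 << 64))
    return int(trace_id_hex[-16:], 16) < threshold

# The same trace id gives the same answer on every node, so a distributed
# trace is either recorded in full or dropped in full.
```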
What Gets Instrumented Automatically
| Signal | What | Details |
|---|---|---|
| Traces | http.request | Method, route, status, duration |
| Traces | fn.execute | Function name, kind (query/mutation) |
| Traces | job.execute | Job type, duration, outcome |
| Metrics | http_requests_total | Counter by method, path, status |
| Metrics | http_request_duration_seconds | Histogram of request latency |
| Metrics | job_executions_total | Counter by job type and status |
| Metrics | fn.executions_total | Counter by function and kind |
Trace Propagation
Every response includes correlation headers:
| Header | Description |
|---|---|
| x-request-id | UUID for the request |
| x-trace-id | OpenTelemetry trace ID |
| x-span-id | Current span ID |
Forge extracts inbound traceparent headers per the W3C Trace Context spec, so requests join existing distributed traces from upstream services.
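A traceparent header has four hyphen-separated fields: version, a 32-hex-char trace ID, a 16-hex-char parent span ID, and flags. A minimal parser for the format (the actual extraction is handled by Forge's OpenTelemetry integration):

```python
def parse_traceparent(header: str) -> dict:
    """Split a W3C traceparent header into its four fields."""
    version, trace_id, parent_id, flags = header.split("-")
    if len(trace_id) != 32 or len(parent_id) != 16:
        raise ValueError("malformed traceparent header")
    return {"version": version, "trace_id": trace_id,
            "parent_id": parent_id, "flags": flags}

tp = parse_traceparent("00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01")
# the parsed trace_id joins this request to the upstream distributed trace
```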
Suppress Noisy Routes
Health check endpoints generate a lot of telemetry noise. Exclude them:
[gateway]
quiet_routes = ["/_api/health", "/_api/ready"]
Console logs always work regardless of the enabled flag. The flag only controls OTLP export.
Step 5: Health Checks
Forge exposes two built-in endpoints:
| Endpoint | Purpose | Healthy | Unhealthy |
|---|---|---|---|
| /_api/health | Liveness -- is the process alive? | 200 { status: "healthy", version: "0.1.0" } | Process is dead |
| /_api/ready | Readiness -- can it serve traffic? | 200 { ready: true, database: true } | 503 { ready: false, database: false } |
The readiness probe runs SELECT 1 against the database. Use it for load balancer health checks:
# docker-compose.yml
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:8080/_api/ready"]
  interval: 5s
  timeout: 5s
  retries: 30
  start_period: 60s
Step 6: Scale with Multiple Nodes
Deploy the same binary to multiple servers. Nodes discover each other through PostgreSQL -- no Zookeeper, no Raft, no additional infrastructure.
[cluster]
discovery = "postgres"
heartbeat_interval_secs = 5
dead_threshold_secs = 15
[node]
roles = ["gateway", "function", "worker", "scheduler"]
How It Works
- Leader election: One scheduler node acquires a PostgreSQL advisory lock. Others wait as standbys. If the leader dies, the lock releases and another node takes over within seconds.
- Job distribution: Workers claim jobs with FOR UPDATE SKIP LOCKED. No double processing, no coordination layer. Add workers and throughput scales linearly.
- Failure recovery: Nodes send heartbeats every 5 seconds. A node that misses heartbeats for more than 15 seconds is marked dead, and its jobs are reclaimed automatically.
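The claim step is the standard SKIP LOCKED queue pattern. A sketch of the kind of query involved (table and column names are illustrative assumptions, not Forge's actual schema):

```sql
UPDATE jobs
SET    status = 'running', claimed_by = $1
WHERE  id = (
    SELECT id
    FROM   jobs
    WHERE  status = 'queued'
    ORDER  BY created_at
    LIMIT  1
    FOR UPDATE SKIP LOCKED  -- skip rows another worker has already locked
)
RETURNING id, job_type, args;
```

Rows locked by one worker are invisible to the others' inner SELECT, so two workers can never claim the same job.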
Separate Roles for Larger Deployments
A single node runs all roles by default. For larger deployments, specialize:
# API nodes (3x) -- handle HTTP, execute functions, no background work
[node]
roles = ["gateway", "function"]
# Worker nodes (5x) -- process jobs, no HTTP traffic
[node]
roles = ["worker"]
worker_capabilities = ["general"]
# Scheduler -- runs crons (deploy 2 for failover, only one is active)
[node]
roles = ["scheduler"]
| Role | Responsibility |
|---|---|
| gateway | HTTP endpoints, SSE subscriptions |
| function | Query and mutation execution |
| worker | Background job processing |
| scheduler | Cron scheduling (leader-only) |
Step 7: Worker Pools
Route jobs to specialized hardware using worker capabilities:
#[forge::job(worker_capability = "gpu")]
pub async fn train_model(ctx: &JobContext, args: TrainArgs) -> Result<Model> {
// Only workers advertising "gpu" will claim this job
run_training(&args).await
}
Configure the node to advertise its capabilities:
# GPU worker node
[node]
roles = ["worker"]
worker_capabilities = ["gpu"]
[worker]
max_concurrent_jobs = 4 # limited by GPU memory
job_timeout_secs = 7200 # 2 hours for training jobs
Jobs without a worker_capability run on any worker, and workers with multiple capabilities claim jobs targeting any of them. Each capability pool operates independently, so a surge in GPU jobs does not starve the other pools.
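The matching rule can be sketched as (illustrative; the function and argument names are assumptions):

```python
def can_claim(worker_capabilities: set, job_capability=None) -> bool:
    # A job with no capability requirement runs on any worker; otherwise
    # the worker must advertise the job's required capability.
    return job_capability is None or job_capability in worker_capabilities

can_claim({"general"}, "gpu")         # False: general worker skips GPU jobs
can_claim({"gpu", "general"}, "gpu")  # True: multi-capability worker claims it
can_claim({"general"})                # True: untargeted jobs run anywhere
```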
Step 8: Deploy Checklist
- Build with cargo build --release (frontend embedded by default)
- Set DATABASE_URL pointing to PostgreSQL 15+
- Set JWT_SECRET (or configure JWKS for asymmetric auth)
- Migrations run automatically on startup -- just deploy
- Configure cors_origins for your domain
- Set up health check monitoring on /_api/ready
- Enable observability ([observability] enabled = true)
- Size pool_size for your workload (default 50)
- Consider read replicas for heavy read workloads
- On Kubernetes, set terminationGracePeriodSeconds > 30s (Forge drains for 30s on SIGTERM)
Platform Examples
Docker Compose (Production)
services:
  app:
    build:
      context: .
      target: production
    ports:
      - "8080:8080"
    environment:
      - DATABASE_URL=postgres://postgres:password@db:5432/myapp
      - JWT_SECRET=${JWT_SECRET}
      - RUST_LOG=info
    depends_on:
      db:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/_api/ready"]
      interval: 5s
      timeout: 5s
      retries: 30

  db:
    image: postgres:18
    environment:
      POSTGRES_PASSWORD: password
      POSTGRES_DB: myapp
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 5s
      retries: 5

volumes:
  postgres_data:
Fly.io
# Create a Postgres database
fly postgres create --name myapp-db
# Deploy the app (uses the Dockerfile)
fly launch
fly secrets set JWT_SECRET="$(openssl rand -base64 32)"
fly secrets set DATABASE_URL="postgres://..."
fly deploy
Add to fly.toml:
[http_service]
internal_port = 8080
[[http_service.checks]]
path = "/_api/ready"
interval = "5s"
timeout = "3s"
Bare Metal (systemd)
# /etc/systemd/system/myapp.service
[Unit]
Description=My Forge App
After=network.target postgresql.service
[Service]
Type=simple
User=myapp
WorkingDirectory=/opt/myapp
ExecStart=/opt/myapp/my-app
Environment=DATABASE_URL=postgres://user:pass@localhost:5432/myapp
Environment=JWT_SECRET=your-secret-here
Environment=RUST_LOG=info
Restart=always
RestartSec=5
TimeoutStopSec=45
[Install]
WantedBy=multi-user.target
TimeoutStopSec=45 exceeds the 30-second drain timeout, giving the process time to finish in-flight requests before systemd force-kills it.