Ship to Production

Go from docker compose up to a production deployment: build a single binary, configure for production, add observability, and scale.

What You'll Learn

  • Building a release binary with the frontend embedded
  • Configuring forge.toml for production
  • Running migrations safely
  • Setting up health checks and observability
  • Scaling across multiple nodes with worker pools
  • Deploying to Docker, Fly.io, or bare metal

Step 1: Build the Binary

Forge compiles your backend, frontend, and migrations into a single binary. No runtime dependencies beyond PostgreSQL.

SvelteKit Frontend

Use the multi-stage Dockerfile from the demo templates:

# Stage 1: Build frontend
FROM oven/bun:1-alpine AS frontend-builder
WORKDIR /app/frontend
COPY frontend/package.json frontend/bun.lock* ./
RUN bun install --frozen-lockfile || bun install
COPY frontend ./
RUN bun run build

# Stage 2: Build backend with embedded frontend
FROM rust:1-alpine AS builder
WORKDIR /app
RUN apk add --no-cache musl-dev openssl-dev openssl-libs-static pkgconf

COPY Cargo.toml Cargo.lock* ./

# Cache dependencies
RUN mkdir -p src && \
    echo "fn main() {}" > src/main.rs && \
    cargo build --release --no-default-features && \
    rm -rf src

COPY src ./src
COPY migrations ./migrations
COPY .sqlx ./.sqlx
COPY --from=frontend-builder /app/frontend/build ./frontend/build

# Build release with embedded frontend
RUN touch src/main.rs && cargo build --release

# Stage 3: Minimal production image
FROM alpine:3.21 AS production
WORKDIR /app
RUN apk add --no-cache ca-certificates libgcc
COPY --from=builder /app/target/release/my-app /app/my-app
COPY --from=builder /app/migrations /app/migrations
EXPOSE 8080
ENV RUST_LOG=info
CMD ["/app/my-app"]

Dioxus Frontend

cd frontend && dx build --web --release && cd ..
cargo build --release

The embedded-frontend feature is on by default. The binary embeds the compiled frontend using rust-embed and serves it with SPA fallback. During development, disable it with cargo run --no-default-features to run the frontend dev server separately.

Result: one binary (~30MB Alpine image), zero runtime dependencies beyond PostgreSQL.

Step 2: Configure for Production

Create a production forge.toml. Use ${VAR_NAME} syntax for secrets and environment-specific values -- Forge substitutes them at startup.

[project]
name = "my-app"

[database]
url = "${DATABASE_URL}"
pool_size = 50
statement_timeout_secs = 30
# replica_urls = ["${DATABASE_REPLICA_URL}"]
# read_from_replica = true

[gateway]
port = 8080
cors_origins = ["https://myapp.com"]
quiet_routes = ["/_api/health", "/_api/ready"]

[auth]
jwt_algorithm = "HS256"
jwt_secret = "${JWT_SECRET}"

[worker]
max_concurrent_jobs = 20
poll_interval_ms = 100

[observability]
enabled = true
otlp_endpoint = "${OTEL_ENDPOINT}"
log_level = "info"
sampling_ratio = 0.5

Set environment variables for secrets. Never put credentials in forge.toml directly:

export DATABASE_URL="postgres://user:pass@db.example.com:5432/myapp"
export JWT_SECRET="$(openssl rand -base64 32)"
export OTEL_ENDPOINT="http://otel-collector:4318"

Variables that are unset at startup remain as the literal ${VAR_NAME} text instead of failing silently, which makes misconfiguration easy to detect.
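The substitution behavior can be sketched as follows. This is an illustrative standalone sketch, not Forge's actual implementation: known variables are replaced, and unknown ones stay as literal `${NAME}` text so a missed secret is visible in the effective config.

```rust
use std::collections::HashMap;

// Sketch of `${VAR_NAME}` substitution (illustrative, not Forge's code).
// Known variables are replaced; unknown ones are left as literal text.
fn substitute(input: &str, env: &HashMap<String, String>) -> String {
    let mut out = String::new();
    let mut rest = input;
    while let Some(start) = rest.find("${") {
        out.push_str(&rest[..start]);
        match rest[start..].find('}') {
            Some(end) => {
                let name = &rest[start + 2..start + end];
                match env.get(name) {
                    Some(val) => out.push_str(val),
                    // Unset variable: keep the literal `${NAME}`
                    None => out.push_str(&rest[start..=start + end]),
                }
                rest = &rest[start + end + 1..];
            }
            None => {
                // No closing brace: emit the remainder unchanged
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}

fn main() {
    let env: HashMap<String, String> =
        [("JWT_SECRET".to_string(), "s3cret".to_string())].into();
    assert_eq!(
        substitute("jwt_secret = \"${JWT_SECRET}\"", &env),
        "jwt_secret = \"s3cret\""
    );
    // An unset variable survives as-is, so the mistake is easy to spot:
    assert_eq!(
        substitute("url = \"${DATABASE_URL}\"", &env),
        "url = \"${DATABASE_URL}\""
    );
}
```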

Step 3: Run Migrations

Forge runs migrations automatically on startup using a PostgreSQL advisory lock (pg_advisory_lock(0x464F524745)). Multiple nodes can start simultaneously -- the first one runs migrations, the rest wait.

No manual step is needed. When the binary starts, it:

  1. Acquires the advisory lock (blocks if another node holds it)
  2. Runs any pending migrations from the migrations/ directory
  3. Releases the lock
  4. Starts serving traffic

If you prefer to run migrations explicitly before deploy:

# Using the Forge CLI
forge migrate up

The advisory lock is session-scoped. If a node crashes mid-migration, PostgreSQL releases the lock automatically when the connection drops.
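The startup sequence corresponds roughly to this SQL (run it in two psql sessions to watch the second one block until the first disconnects or unlocks):

```sql
-- Session-scoped advisory lock guarding migrations; 0x464F524745 is
-- ASCII "FORGE". A crashed connection releases the lock automatically.
SELECT pg_advisory_lock(x'464F524745'::bigint);   -- blocks until acquired
-- ... run pending migrations ...
SELECT pg_advisory_unlock(x'464F524745'::bigint);
```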

Step 4: Add Observability

Enable OpenTelemetry in forge.toml. Forge auto-instruments everything without code changes:

[observability]
enabled = true
otlp_endpoint = "http://otel-collector:4318"
log_level = "info"
sampling_ratio = 1.0 # sample every trace; lower in high-traffic production
# enable_traces = true
# enable_metrics = true
# enable_logs = true

Forge uses OTLP over HTTP (port 4318), not gRPC (port 4317).
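If you run your own OpenTelemetry Collector, make sure its OTLP receiver has the HTTP protocol enabled. A minimal receiver fragment (collector config, illustrative) looks like:

```yaml
# Enable the OTLP/HTTP receiver on 4318 -- the port Forge exports to.
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318
```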

What Gets Instrumented Automatically

| Signal  | What                          | Details                              |
|---------|-------------------------------|--------------------------------------|
| Traces  | http.request                  | Method, route, status, duration      |
| Traces  | fn.execute                    | Function name, kind (query/mutation) |
| Traces  | job.execute                   | Job type, duration, outcome          |
| Metrics | http_requests_total           | Counter by method, path, status      |
| Metrics | http_request_duration_seconds | Histogram of request latency         |
| Metrics | job_executions_total          | Counter by job type and status       |
| Metrics | fn.executions_total           | Counter by function and kind         |

Trace Propagation

Every response includes correlation headers:

| Header       | Description            |
|--------------|------------------------|
| x-request-id | UUID for the request   |
| x-trace-id   | OpenTelemetry trace ID |
| x-span-id    | Current span ID        |

Forge extracts inbound traceparent headers per the W3C Trace Context spec, so requests join existing distributed traces from upstream services.
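A traceparent header has the shape `version-traceid-spanid-flags`. This illustrative parser (not Forge's internal code) shows the pieces a joining service would reuse:

```rust
// Illustrative parser for the W3C Trace Context `traceparent` header
// (`version-traceid-spanid-flags`). Returns the trace ID and parent
// span ID that an inbound request contributes to the current trace.
fn parse_traceparent(header: &str) -> Option<(String, String)> {
    let parts: Vec<&str> = header.split('-').collect();
    // 2-hex version, 32-hex trace ID, 16-hex span ID, 2-hex flags
    if parts.len() != 4
        || parts[0].len() != 2
        || parts[1].len() != 32
        || parts[2].len() != 16
    {
        return None;
    }
    if !parts
        .iter()
        .all(|p| p.chars().all(|c| c.is_ascii_hexdigit()))
    {
        return None;
    }
    Some((parts[1].to_string(), parts[2].to_string()))
}

fn main() {
    let h = "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01";
    let (trace_id, span_id) = parse_traceparent(h).unwrap();
    assert_eq!(trace_id, "0af7651916cd43dd8448eb211c80319c");
    assert_eq!(span_id, "b7ad6b7169203331");
    assert!(parse_traceparent("garbage").is_none());
}
```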

Suppress Noisy Routes

Health check endpoints generate a lot of telemetry noise. Exclude them:

[gateway]
quiet_routes = ["/_api/health", "/_api/ready"]

Console logs always work regardless of the enabled flag. The flag only controls OTLP export.

Step 5: Health Checks

Forge exposes two built-in endpoints:

| Endpoint     | Purpose                            | Healthy                                     | Unhealthy                             |
|--------------|------------------------------------|---------------------------------------------|---------------------------------------|
| /_api/health | Liveness -- is the process alive?  | 200 { status: "healthy", version: "0.1.0" } | Process is dead                       |
| /_api/ready  | Readiness -- can it serve traffic? | 200 { ready: true, database: true }         | 503 { ready: false, database: false } |

The readiness probe runs SELECT 1 against the database. Use it for load balancer health checks:

# docker-compose.yml
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:8080/_api/ready"]
  interval: 5s
  timeout: 5s
  retries: 30
  start_period: 60s

Step 6: Scale with Multiple Nodes

Deploy the same binary to multiple servers. Nodes discover each other through PostgreSQL -- no Zookeeper, no Raft, no additional infrastructure.

[cluster]
discovery = "postgres"
heartbeat_interval_secs = 5
dead_threshold_secs = 15

[node]
roles = ["gateway", "function", "worker", "scheduler"]

How It Works

  • Leader election: One scheduler node acquires a PostgreSQL advisory lock. Others wait as standbys. If the leader dies, the lock releases and another node takes over within seconds.
  • Job distribution: Workers claim jobs with FOR UPDATE SKIP LOCKED. No double processing, no coordination layer. Add workers, throughput scales linearly.
  • Failure recovery: Nodes send heartbeats every 5 seconds. A node that misses heartbeats for 15 seconds is marked dead, and its jobs are reclaimed automatically.
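The claim step can be sketched in SQL (table and column names here are hypothetical -- Forge's actual schema may differ):

```sql
-- Claim one pending job. FOR UPDATE SKIP LOCKED means concurrent
-- workers skip rows another transaction already locked, so no worker
-- blocks and no job is claimed twice.
UPDATE jobs
SET status = 'running', claimed_by = $1, claimed_at = now()
WHERE id = (
    SELECT id FROM jobs
    WHERE status = 'pending'
    ORDER BY created_at
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
RETURNING id;
```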

Separate Roles for Larger Deployments

A single node runs all roles by default. For larger deployments, specialize:

# API nodes (3x) -- handle HTTP, execute functions, no background work
[node]
roles = ["gateway", "function"]

# Worker nodes (5x) -- process jobs, no HTTP traffic
[node]
roles = ["worker"]
worker_capabilities = ["general"]

# Scheduler -- runs crons (deploy 2 for failover, only one is active)
[node]
roles = ["scheduler"]

| Role      | Responsibility                    |
|-----------|-----------------------------------|
| gateway   | HTTP endpoints, SSE subscriptions |
| function  | Query and mutation execution      |
| worker    | Background job processing         |
| scheduler | Cron scheduling (leader-only)     |

Step 7: Worker Pools

Route jobs to specialized hardware using worker capabilities:

#[forge::job(worker_capability = "gpu")]
pub async fn train_model(ctx: &JobContext, args: TrainArgs) -> Result<Model> {
    // Only workers advertising "gpu" will claim this job
    run_training(&args).await
}

Configure the node to advertise its capabilities:

# GPU worker node
[node]
roles = ["worker"]
worker_capabilities = ["gpu"]

[worker]
max_concurrent_jobs = 4 # limited by GPU memory
job_timeout_secs = 7200 # 2 hours for training jobs

Jobs without a worker_capability run on any worker. Workers with multiple capabilities claim jobs targeting any of them. Each capability pool operates independently, so a surge in GPU jobs does not starve jobs in other pools.
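The claiming rule reduces to a small predicate. This is a hypothetical helper, not a Forge API, written only to make the matching rule concrete:

```rust
// Sketch of the capability-matching rule described above (hypothetical
// helper, not part of Forge): a job with no required capability runs on
// any worker; a tagged job runs only on workers advertising that tag.
fn can_claim(worker_capabilities: &[&str], job_capability: Option<&str>) -> bool {
    match job_capability {
        None => true, // untagged jobs run anywhere
        Some(cap) => worker_capabilities.contains(&cap),
    }
}

fn main() {
    assert!(can_claim(&["general"], None)); // untagged job, any worker
    assert!(can_claim(&["gpu", "general"], Some("gpu")));
    assert!(!can_claim(&["general"], Some("gpu"))); // GPU job skips this worker
}
```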

Step 8: Deploy Checklist

  • Build with cargo build --release (frontend embedded by default)
  • Set DATABASE_URL pointing to PostgreSQL 15+
  • Set JWT_SECRET (or configure JWKS for asymmetric auth)
  • Migrations run automatically on startup -- just deploy
  • Configure cors_origins for your domain
  • Set up health check monitoring on /_api/ready
  • Enable observability ([observability] enabled = true)
  • Size pool_size for your workload (default 50)
  • Consider read replicas for heavy read workloads
  • Set terminationGracePeriodSeconds > 30s (Forge drains for 30s on SIGTERM)
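On Kubernetes, the last two items look roughly like this pod spec fragment (illustrative; names like `app` are placeholders):

```yaml
# Grace period exceeds Forge's 30s drain; readiness hits /_api/ready.
spec:
  terminationGracePeriodSeconds: 45
  containers:
    - name: app
      readinessProbe:
        httpGet:
          path: /_api/ready
          port: 8080
        periodSeconds: 5
```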

Platform Examples

Docker Compose (Production)

services:
  app:
    build:
      context: .
      target: production
    ports:
      - "8080:8080"
    environment:
      - DATABASE_URL=postgres://postgres:password@db:5432/myapp
      - JWT_SECRET=${JWT_SECRET}
      - RUST_LOG=info
    depends_on:
      db:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/_api/ready"]
      interval: 5s
      timeout: 5s
      retries: 30

  db:
    image: postgres:18
    environment:
      POSTGRES_PASSWORD: password
      POSTGRES_DB: myapp
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 5s
      retries: 5

volumes:
  postgres_data:

Fly.io

# Create a Postgres database
fly postgres create --name myapp-db

# Deploy the app (uses the Dockerfile)
fly launch
fly secrets set JWT_SECRET="$(openssl rand -base64 32)"
fly secrets set DATABASE_URL="postgres://..."
fly deploy

Add to fly.toml:

[http_service]
internal_port = 8080

[[http_service.checks]]
path = "/_api/ready"
interval = "5s"
timeout = "3s"

Bare Metal (systemd)

# /etc/systemd/system/myapp.service
[Unit]
Description=My Forge App
After=network.target postgresql.service

[Service]
Type=simple
User=myapp
WorkingDirectory=/opt/myapp
ExecStart=/opt/myapp/my-app
Environment=DATABASE_URL=postgres://user:pass@localhost:5432/myapp
Environment=JWT_SECRET=your-secret-here
Environment=RUST_LOG=info
Restart=always
RestartSec=5
TimeoutStopSec=45

[Install]
WantedBy=multi-user.target

TimeoutStopSec=45 exceeds the 30-second drain timeout, giving the process time to finish in-flight requests before systemd force-kills it.