Configuration

Configure database connections, authentication, workers, clustering, and node roles in forge.toml.

The Code

[project]
name = "my-app"

[database]
url = "${DATABASE_URL}"
pool_size = 50
replica_urls = ["${DATABASE_REPLICA_URL}"]

[gateway]
port = 8080

[worker]
max_concurrent_jobs = 10
poll_interval_ms = 100

[auth]
jwt_algorithm = "RS256"
jwks_url = "https://www.googleapis.com/service_accounts/v1/jwk/securetoken@system.gserviceaccount.com"
jwt_issuer = "https://securetoken.google.com/my-project"

[node]
roles = ["gateway", "worker", "scheduler"]
worker_capabilities = ["general", "media"]

What Happens

Forge reads forge.toml at startup and substitutes environment variables. Each section configures a different subsystem. Sections you omit use sensible defaults.

Environment variables use ${VAR_NAME} syntax (uppercase letters, numbers, underscores). Default values are supported with ${VAR-default} or ${VAR:-default} syntax. Unset variables without defaults remain as literal strings.
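For example, a variable can be given a local fallback with the default-value syntax (the connection string shown here is illustrative):

```toml
[database]
# Falls back to a local database when DATABASE_URL is unset
url = "${DATABASE_URL:-postgres://localhost:5432/dev}"
pool_size = 50
```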

Sections

[project]

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | "forge-app" | Project identifier |
| version | string | "0.1.0" | Project version |

[database]

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| url | string | - | PostgreSQL connection URL |
| pool_size | u32 | 50 | Connection pool size |
| pool_timeout_secs | u64 | 30 | Pool checkout timeout |
| statement_timeout_secs | u64 | 30 | Query timeout |
| replica_urls | string[] | [] | Read replica URLs |
| read_from_replica | bool | false | Route reads to replicas |

[database]
url = "${DATABASE_URL}"

During development, forge dev runs PostgreSQL via Docker Compose. For production, provide a DATABASE_URL pointing to your PostgreSQL instance.

Read Replicas

[database]
url = "${DATABASE_URL}"
replica_urls = [
  "${DATABASE_REPLICA_1}",
  "${DATABASE_REPLICA_2}"
]
read_from_replica = true

Queries route to healthy replicas via round-robin. Mutations always use the primary. A background monitor pings each replica every 15 seconds and removes unhealthy ones from rotation. If all replicas fail, reads fall back to primary.

Queries that need read-after-write consistency can use #[forge::query(consistent)] to bypass replicas and read directly from the primary.
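The selection policy can be sketched in plain Rust. This is an illustrative model only — the type and field names below are invented, not the runtime's actual internals:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Illustrative read-routing policy: round-robin over healthy
/// replicas, falling back to the primary when none are healthy.
struct ReadRouter {
    replicas: Vec<String>,
    healthy: Vec<bool>, // maintained by the background health monitor
    cursor: AtomicUsize,
}

impl ReadRouter {
    fn pick_read_target(&self) -> &str {
        // Keep only replicas the monitor currently considers healthy.
        let healthy: Vec<&String> = self
            .replicas
            .iter()
            .zip(&self.healthy)
            .filter(|(_, ok)| **ok)
            .map(|(url, _)| url)
            .collect();
        if healthy.is_empty() {
            return "primary"; // all replicas down: reads fall back to primary
        }
        // Round-robin across the healthy set.
        let i = self.cursor.fetch_add(1, Ordering::Relaxed) % healthy.len();
        healthy[i].as_str()
    }
}

fn main() {
    let router = ReadRouter {
        replicas: vec!["replica-1".into(), "replica-2".into()],
        healthy: vec![true, false], // replica-2 removed from rotation
        cursor: AtomicUsize::new(0),
    };
    // Only healthy replicas are ever chosen.
    assert_eq!(router.pick_read_target(), "replica-1");
    assert_eq!(router.pick_read_target(), "replica-1");
}
```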

Pool Isolation (Bulkhead)

Separate connection pools prevent runaway workloads from starving others:

[database]
url = "${DATABASE_URL}"
pool_size = 50

[database.pools.default]
size = 30
timeout_secs = 30

[database.pools.jobs]
size = 15
timeout_secs = 60
statement_timeout_secs = 300

[database.pools.analytics]
size = 5
timeout_secs = 120
statement_timeout_secs = 600

[database.pools.observability]
size = 3
timeout_secs = 5
statement_timeout_secs = 10

Available pool names and what uses them:

| Pool | Used By |
| --- | --- |
| default | Queries, mutations, rate limiter, reactor, cluster coordination |
| jobs | Job workers, cron runners, daemon processes, workflow executors |
| analytics | Available via db.analytics_pool() for user code |
| observability | Internal metrics collection (pool utilization, slow query tracking) |

Without pool isolation configured, everything shares the primary pool. With it configured, a spike in background job processing cannot starve user-facing query connections. Each pool enforces independent connection limits, checkout timeouts, and statement timeouts.
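Conceptually, each named pool behaves like an independent counting semaphore with its own checkout timeout. A minimal std-only sketch of that behavior — illustrative, not the actual pool implementation:

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// Illustrative bulkhead: each named pool has its own permit budget,
/// so exhausting one pool cannot consume another pool's connections.
struct Pool {
    permits: Mutex<usize>,
    available: Condvar,
}

impl Pool {
    fn new(size: usize) -> Self {
        Pool { permits: Mutex::new(size), available: Condvar::new() }
    }

    /// Try to check out a connection, waiting up to `timeout`.
    fn checkout(&self, timeout: Duration) -> bool {
        let mut permits = self.permits.lock().unwrap();
        while *permits == 0 {
            let (guard, res) = self.available.wait_timeout(permits, timeout).unwrap();
            permits = guard;
            if res.timed_out() && *permits == 0 {
                return false; // checkout timeout: fail fast instead of piling up
            }
        }
        *permits -= 1;
        true
    }

    fn release(&self) {
        *self.permits.lock().unwrap() += 1;
        self.available.notify_one();
    }
}

fn main() {
    let jobs = Pool::new(1);
    let default = Pool::new(2);
    assert!(jobs.checkout(Duration::from_millis(10)));
    // The jobs pool is exhausted...
    assert!(!jobs.checkout(Duration::from_millis(10)));
    // ...but the default pool is unaffected.
    assert!(default.checkout(Duration::from_millis(10)));
    jobs.release();
    assert!(jobs.checkout(Duration::from_millis(10)));
}
```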

[gateway]

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| port | u16 | 8080 | HTTP port |
| grpc_port | u16 | 9000 | Inter-node communication port |
| max_connections | usize | 4096 | Maximum concurrent connections |
| request_timeout_secs | u64 | 30 | Request timeout |
| cors_enabled | bool | false | Enable CORS handling |
| cors_origins | string[] | [] | Allowed CORS origins (use ["*"] for any) |
| quiet_routes | string[] | ["/_api/health", "/_api/ready"] | Routes excluded from traces, metrics, and logs |

[function]

Controls query and mutation execution limits.

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| max_concurrent | usize | 1000 | Maximum concurrent function executions |
| timeout_secs | u64 | 30 | Function execution timeout |
| memory_limit | usize | 536870912 | Memory limit per function (bytes, 512 MiB) |

[function]
max_concurrent = 1000
timeout_secs = 30
memory_limit = 536870912 # 512 MiB

The memory limit is advisory: functions exceeding it may be terminated. Set it to match your workload.

[security]

Reserved security settings parsed from config for forward compatibility.

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| secret_key | string | - | Reserved; currently not used by the runtime |

[security]
secret_key = "${FORGE_SECRET_KEY}"

Generate a secure key if you want to populate this value ahead of future runtime support:

openssl rand -base64 32

[auth]

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| jwt_algorithm | string | "HS256" | Signing algorithm |
| jwt_secret | string | - | Secret for HMAC algorithms |
| jwks_url | string | - | JWKS endpoint for RSA algorithms |
| jwks_cache_ttl_secs | u64 | 3600 | Public key cache duration |
| jwt_issuer | string | - | Expected issuer (optional) |
| jwt_audience | string | - | Expected audience (optional) |
| token_expiry | string | - | Optional app-level convention; not applied automatically by ctx.issue_token() |
| session_ttl_secs | u64 | 604800 | Session TTL (7 days) |

HMAC (Symmetric)

[auth]
jwt_algorithm = "HS256" # or HS384, HS512
jwt_secret = "${JWT_SECRET}"

RSA with JWKS (Asymmetric)

[auth]
jwt_algorithm = "RS256" # or RS384, RS512
jwks_url = "https://your-provider.com/.well-known/jwks.json"
jwt_issuer = "https://your-provider.com"
jwt_audience = "your-app-id"

Common JWKS URLs:

| Provider | JWKS URL |
| --- | --- |
| Firebase | https://www.googleapis.com/service_accounts/v1/jwk/securetoken@system.gserviceaccount.com |
| Auth0 | https://YOUR_DOMAIN.auth0.com/.well-known/jwks.json |
| Clerk | https://YOUR_DOMAIN.clerk.accounts.dev/.well-known/jwks.json |
| Supabase | https://YOUR_PROJECT.supabase.co/auth/v1/jwks |

[mcp]

Controls Forge MCP server exposure on Streamable HTTP.

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Enable MCP endpoint |
| path | string | "/mcp" | MCP endpoint path under /_api |
| session_ttl_secs | u64 | 3600 | MCP session lifetime |
| allowed_origins | string[] | [] | Allowed Origin values |
| require_protocol_version_header | bool | true | Require MCP-Protocol-Version header after initialize |

[mcp]
enabled = true
path = "/mcp"
session_ttl_secs = 3600
allowed_origins = ["https://your-app.example"]
require_protocol_version_header = true

With default API routing, path = "/mcp" resolves to /_api/mcp.

[worker]

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| max_concurrent_jobs | usize | 50 | Concurrent job limit per worker |
| job_timeout_secs | u64 | 3600 | Default job timeout (1 hour) |
| poll_interval_ms | u64 | 100 | Queue polling interval |

Workers maintain a semaphore sized to max_concurrent_jobs and poll the queue only when permits are available, so backpressure propagates naturally.
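That gating can be modeled as a simple permit counter; a simplified sketch with invented names, not the runtime's actual worker loop:

```rust
/// Illustrative polling gate: the worker holds `max_concurrent_jobs`
/// permits and dequeues a job only when at least one permit is free,
/// so a saturated worker stops pulling work from the queue.
struct WorkerGate {
    max_concurrent_jobs: usize,
    active: usize,
}

impl WorkerGate {
    fn try_claim(&mut self) -> bool {
        if self.active < self.max_concurrent_jobs {
            self.active += 1; // take a permit before dequeuing a job
            true
        } else {
            false // saturated: skip this poll tick entirely
        }
    }

    fn finish(&mut self) {
        self.active -= 1; // job done: permit returns, polling resumes
    }
}

fn main() {
    let mut gate = WorkerGate { max_concurrent_jobs: 2, active: 0 };
    assert!(gate.try_claim());
    assert!(gate.try_claim());
    assert!(!gate.try_claim()); // at capacity: no queue poll
    gate.finish();
    assert!(gate.try_claim()); // permit freed: polling resumes
}
```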

[cluster]

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | "default" | Cluster identifier |
| discovery | string | "postgres" | Discovery method; the current runtime only implements postgres |
| heartbeat_interval_secs | u64 | 5 | Heartbeat frequency |
| dead_threshold_secs | u64 | 15 | Missing heartbeats before dead |
| seed_nodes | string[] | [] | Static seed node addresses (for static discovery) |
| dns_name | string | - | DNS name for service discovery (for dns discovery) |

Discovery

Nodes register in the forge_nodes database table by default, so an external service is not required. The current runtime only implements Postgres-backed discovery; other configured discovery values are parsed but ignored with a warning.

[cluster]
discovery = "postgres"

[node]

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| roles | string[] | all roles | Roles this node assumes |
| worker_capabilities | string[] | ["general"] | Job routing capabilities |

Node Roles

| Role | Responsibility |
| --- | --- |
| gateway | HTTP/gRPC endpoints, SSE subscriptions |
| function | Query and mutation execution |
| worker | Background job processing |
| scheduler | Cron scheduling, leader election |

Single-node deployment (default):

[node]
roles = ["gateway", "function", "worker", "scheduler"]

API-only node:

[node]
roles = ["gateway", "function"]

Worker-only node:

[node]
roles = ["worker"]
worker_capabilities = ["gpu", "ml"]

Scheduler node (singleton per cluster):

[node]
roles = ["scheduler"]

Multiple nodes can run the scheduler role; advisory locks ensure only one is active while the others wait as standbys.
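Conceptually this maps onto Postgres advisory locks; an illustrative query (the lock key here is made up):

```sql
-- Returns true for exactly one caller at a time;
-- nodes that get false remain standbys and retry.
SELECT pg_try_advisory_lock(42);
```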

Worker Capabilities

Route jobs to specific workers:

# GPU worker
[node]
roles = ["worker"]
worker_capabilities = ["gpu"]

# General purpose worker
[node]
roles = ["worker"]
worker_capabilities = ["general", "media"]

Jobs requiring worker_capability = "gpu" only run on workers with that capability. Jobs without a capability requirement run on any worker.
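The matching rule can be expressed as a small predicate; a sketch with an invented function name, not the runtime's actual dispatcher:

```rust
/// Illustrative capability check: a job with a required capability only
/// matches workers advertising it; jobs without a requirement match any worker.
fn worker_can_run(required: Option<&str>, worker_capabilities: &[&str]) -> bool {
    match required {
        Some(cap) => worker_capabilities.contains(&cap),
        None => true, // no requirement: any worker may run it
    }
}

fn main() {
    let gpu_worker = ["gpu"];
    let general_worker = ["general", "media"];
    assert!(worker_can_run(Some("gpu"), &gpu_worker));
    assert!(!worker_can_run(Some("gpu"), &general_worker));
    assert!(worker_can_run(None, &general_worker));
}
```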

[observability]

OTLP-based telemetry for traces, metrics, and logs. Disabled by default. When enabled, Forge auto-instruments HTTP requests, function calls, job execution, and database queries without any application code changes.

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Enable OTLP telemetry export |
| otlp_endpoint | string | "http://localhost:4318" | OTLP collector endpoint (HTTP) |
| service_name | string | project name | Service name in telemetry data |
| enable_traces | bool | true | Export distributed traces |
| enable_metrics | bool | true | Export metrics |
| enable_logs | bool | true | Export logs via OTLP |
| sampling_ratio | f64 | 1.0 | Trace sampling ratio (0.0 to 1.0) |
| log_level | string | "info" | Log level for the tracing subscriber |

[observability]
enabled = true
otlp_endpoint = "http://localhost:4318"
sampling_ratio = 0.5

Requires an OTLP-compatible collector (Jaeger, Grafana Alloy, OpenTelemetry Collector, etc.).

What Gets Instrumented

With enabled = true, Forge automatically creates spans and records metrics for:

  • HTTP requests (http.request span): method, route, status code, duration, trace ID, request ID
  • Function calls (fn.execute span): function name, kind (query/mutation), duration
  • Job execution (job.execute span): job ID, job type, duration, outcome (completed/retrying/failed/timeout)
  • Database queries: operation, table, duration, connection pool utilization

Slow queries (over 500ms) emit a warning automatically. Database pool metrics (size, active, idle, waiting) are recorded every 15 seconds.

Routes listed in [gateway].quiet_routes are excluded from all telemetry. Health and readiness probes are excluded by default to avoid noise from Kubernetes liveness checks. Set quiet_routes = [] to monitor everything.

Console logs always work regardless of the enabled flag. The flag only controls OTLP export.

Patterns

Development

Development uses forge dev which runs Docker Compose. The forge.toml in generated projects uses ${DATABASE_URL} which is set by the Docker Compose environment:

[project]
name = "my-app"

[database]
url = "${DATABASE_URL}"

[gateway]
port = 8080

Production Single Node

[project]
name = "my-app"

[database]
url = "${DATABASE_URL}"
pool_size = 100

[gateway]
port = 8080

[auth]
jwt_algorithm = "RS256"
jwks_url = "${JWKS_URL}"
jwt_issuer = "${JWT_ISSUER}"
jwt_audience = "${JWT_AUDIENCE}"

[worker]
max_concurrent_jobs = 20

Production Multi-Node

API nodes:

[database]
url = "${DATABASE_URL}"
replica_urls = ["${DATABASE_REPLICA_URL}"]
read_from_replica = true

[database.pools.default]
size = 40

[node]
roles = ["gateway", "function"]

[cluster]
discovery = "postgres"

Worker nodes:

[database]
url = "${DATABASE_URL}"

[database.pools.jobs]
size = 30
statement_timeout_secs = 600

[node]
roles = ["worker"]
worker_capabilities = ["general"]

[worker]
max_concurrent_jobs = 25

[cluster]
discovery = "postgres"

Specialized Workers

GPU processing node:

[node]
roles = ["worker"]
worker_capabilities = ["gpu"]

[worker]
max_concurrent_jobs = 4 # GPU memory limits concurrency
job_timeout_secs = 7200 # 2 hours for training jobs

Under the Hood

Environment Variable Substitution

Variables match the pattern ${VAR_NAME}, where VAR_NAME starts with an uppercase letter or underscore, followed by uppercase letters, numbers, or underscores:

let re = Regex::new(r"\$\{([A-Z_][A-Z0-9_]*)\}")?;

Substitution happens at parse time. Unset variables remain as literal ${VAR_NAME} strings (useful for detecting misconfiguration).
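Including the default-value forms, the whole pass can be sketched without a regex. This is an illustrative model of the behavior described above, not the actual parser (which also validates the variable-name charset):

```rust
use std::collections::HashMap;

/// Illustrative substitution pass covering ${VAR}, ${VAR-default},
/// and ${VAR:-default}. Unset variables without a default are left
/// as literal ${VAR} text so misconfiguration stays visible.
fn substitute(input: &str, env: &HashMap<String, String>) -> String {
    let mut out = String::new();
    let mut rest = input;
    while let Some(start) = rest.find("${") {
        out.push_str(&rest[..start]);
        let tail = &rest[start + 2..];
        match tail.find('}') {
            Some(end) => {
                let expr = &tail[..end];
                // Split "NAME:-default" / "NAME-default" into name + default.
                let (name, default) = if let Some(i) = expr.find(":-") {
                    (&expr[..i], Some(&expr[i + 2..]))
                } else if let Some(i) = expr.find('-') {
                    (&expr[..i], Some(&expr[i + 1..]))
                } else {
                    (expr, None)
                };
                match (env.get(name), default) {
                    (Some(v), _) => out.push_str(v),
                    (None, Some(d)) => out.push_str(d),
                    // Unset, no default: keep the literal ${VAR} text.
                    (None, None) => out.push_str(&rest[start..start + 2 + end + 1]),
                }
                rest = &tail[end + 1..];
            }
            None => {
                // Unterminated "${": emit the remainder verbatim.
                out.push_str(&rest[start..]);
                return out;
            }
        }
    }
    out.push_str(rest);
    out
}

fn main() {
    let mut env = HashMap::new();
    env.insert("DATABASE_URL".to_string(), "postgres://db".to_string());
    assert_eq!(substitute("url = \"${DATABASE_URL}\"", &env), "url = \"postgres://db\"");
    assert_eq!(substitute("${PORT:-8080}", &env), "8080");
    assert_eq!(substitute("${MISSING}", &env), "${MISSING}");
}
```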

Bulkhead Isolation

Connection pools isolate workloads:

┌─────────────────────────────────────────────────┐
│                   PostgreSQL                    │
└─────────────────────────────────────────────────┘
      ▲              ▲              ▲
      │              │              │
 ┌────┴────┐    ┌────┴────┐    ┌────┴────┐
 │ default │    │  jobs   │    │analytics│
 │ 30 conn │    │ 15 conn │    │  5 conn │
 │ 30s TO  │    │ 300s TO │    │ 600s TO │
 └────┬────┘    └────┬────┘    └────┬────┘
      │              │              │
      ▼              ▼              ▼
 ┌─────────┐    ┌─────────┐    ┌─────────┐
 │ Queries │    │  Jobs   │    │ Reports │
 │Mutations│    │         │    │         │
 └─────────┘    └─────────┘    └─────────┘

A runaway batch job cannot exhaust connections reserved for user requests.

Cluster Discovery

Nodes discover each other through PostgreSQL:

SELECT * FROM forge_nodes WHERE last_heartbeat > NOW() - INTERVAL '15s'

Nodes insert their address on startup, update on heartbeat, and get cleaned up when dead_threshold passes. Additional infrastructure is not required.

Node Role Enforcement

Roles determine which subsystems start:

if config.node.roles.contains(&NodeRole::Gateway) {
    start_http_server(&config.gateway).await?;
}
if config.node.roles.contains(&NodeRole::Worker) {
    start_job_worker(&config.worker).await?;
}
if config.node.roles.contains(&NodeRole::Scheduler) {
    start_cron_scheduler().await?;
}

Omitted roles mean those subsystems never start. A Worker-only node never binds the HTTP port. A Gateway-only node never polls the job queue.