Scale Your App¶

PaaS Runtime supports manual scaling, autoscaling (HPA), scale-to-zero, and bursting. Each app declares process types in Procfile (or paas.toml), and you scale them independently.

Manual scaling¶

Scale web to 3 replicas, worker to 2:

paas ps:scale web=3 worker=2

Output:

✓ web:    1 → 3 replicas (rolling, 30s)
✓ worker: 1 → 2 replicas (rolling, 30s)

paas ps:scale triggers a Kubernetes Deployment update. New replicas pass health-check before old ones terminate (zero-downtime).

Autoscaling (HPA)¶

Enable Horizontal Pod Autoscaler in paas.toml:

[[processes]]
name = "web"
command = "node server.js"
replicas_min = 2
replicas_max = 10
hpa_target_cpu = 70

[[processes]]
name = "worker"
command = "node worker.js"
replicas_min = 1
replicas_max = 5
hpa_target_cpu = 80

PaaS creates a HorizontalPodAutoscaler (autoscaling/v2). On every push, the HPA spec is updated.

Inspect:

$ paas ps:hpa
PROCESS  MIN  MAX  CURRENT  TARGET CPU  CURRENT CPU
web      2    10   3        70%         42%
worker   1    5    1        80%         15%

Scale-to-zero (free plan)¶

Free-tier apps sleep after 30 minutes of zero traffic. The first request after sleep wakes the app (cold start ~5s for Node, ~30s for JVM).

To disable scale-to-zero on the free plan, upgrade:

paas plan:change starter

Starter and above keep at least replicas_min replicas hot.

Vertical scaling (CPU/memory)¶

Set per-process resource limits in paas.toml:

[[processes]]
name = "web"
cpu = "500m"        # 0.5 CPU
memory = "256Mi"

[[processes]]
name = "worker"
cpu = "2"           # 2 CPUs
memory = "2Gi"

PaaS sets both requests and limits to these values (deterministic scheduling, no overcommit).

Burst capacity¶

For traffic spikes beyond replicas_max, use paas burst:

paas burst web=20 --duration 1h

Output:

✓ web: bursting to 20 replicas for 1h
  At 14:42 UTC, web will scale back to HPA bounds (2-10)

Burst replicas land on dedicated burst nodes (separate node pool, no eviction). Useful for known traffic spikes (Black Friday, feature launches).

Cron processes¶

Scheduled jobs are scaled independently — each cron entry creates a Kubernetes CronJob:

[cron]
nightly_report = { schedule = "0 2 * * *", command = "node scripts/report.js" }
hourly_sync = { schedule = "0 * * * *", command = "node scripts/sync.js" }

Each invocation runs in its own pod with the same image + env as web (no scaling needed — Kubernetes handles parallelism).