Manual Scaling
Manual scaling lets you change two independent things on a process:
- Replicas — how many pods serve the workload (horizontal).
- Resources — CPU/memory request and limit per pod (vertical).
Replicas go through POST /scale; resources go through
PUT /resources. The dashboard's Scale page combines both in one UI;
the CLI keeps the existing paas scale web=3 for replicas and adds
web=L syntax for size.
How it works
sequenceDiagram
participant U as Operator
participant CP as Control Plane
participant DB as Postgres (paas_tenant_quotas)
participant K as Kubernetes
U->>CP: PUT /v1/apps/{app}/resources { process_type, size }
CP->>CP: size_to_resources("L") → cpu=2, mem=2Gi
CP->>DB: SELECT quota_cpu, quota_memory_mi WHERE tenant_id = …
alt requested ≤ quota
CP->>K: patch Deployment (resources.requests/limits)
K->>K: rolling update (RollingUpdate strategy)
CP-->>U: 200 OK
else requested > quota
CP-->>U: 422 quota_exceeded
end
T-shirt sizes
Pick a size and let the platform pick CPU + memory:
| Size | CPU req | Memory req | CPU limit | Memory limit |
|---|---|---|---|---|
| Free | 0.25 | 256 MiB | 0.5 | 512 MiB |
| S | 0.5 | 512 MiB | 1 | 1 GiB |
| M | 1 | 1 GiB | 2 | 2 GiB |
| L | 2 | 2 GiB | 4 | 4 GiB |
| XL | 4 | 4 GiB | 8 | 8 GiB |
| 2XL | 8 | 8 GiB | 16 | 16 GiB |
An unknown size silently falls back to Free so a typo can't push a pod into a bucket the cluster can't satisfy.
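The catalogue lookup and its fallback can be sketched in a few lines. This is a minimal illustration, not the platform's actual code: the name size_to_resources comes from the diagram above, while the Resources dataclass, the SIZE_CATALOGUE dict, and the case-sensitive matching are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical container for one catalogue row (CPU in cores, memory in MiB)."""
    cpu_request: float
    memory_request_mi: int
    cpu_limit: float
    memory_limit_mi: int


# Values copied from the size table above.
SIZE_CATALOGUE = {
    "Free": Resources(0.25, 256, 0.5, 512),
    "S": Resources(0.5, 512, 1, 1024),
    "M": Resources(1, 1024, 2, 2048),
    "L": Resources(2, 2048, 4, 4096),
    "XL": Resources(4, 4096, 8, 8192),
    "2XL": Resources(8, 8192, 16, 16384),
}


def size_to_resources(size: str) -> Resources:
    """Look up a size name; anything unknown falls back to Free."""
    return SIZE_CATALOGUE.get(size, SIZE_CATALOGUE["Free"])
```

With this sketch, size_to_resources("L") yields a 2 CPU / 2048 MiB request, and a typo such as "Lg" yields the Free row rather than an error.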
You can also bypass the catalogue and pass cpu / memory directly:
curl -X PUT https://runtime.di2amp.com/api/v1/apps/$APP/resources \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"process_type": "web", "cpu": "750m", "memory": "768Mi"}'
When size is set, it takes precedence over the per-field overrides.
If no explicit limit is given, the limit defaults to twice the request
value (QoS stays Burstable, but a runaway pod can't starve its neighbours).
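To make the doubling rule concrete, here is a small sketch that derives limits from per-field requests. The helper names (parse_cpu_millis, parse_memory_mi, default_limits) are hypothetical, and the parsers only handle the quantity forms used on this page (plain cores, the m suffix, Mi, and Gi); the real control plane may normalise quantities differently.

```python
def parse_cpu_millis(value: str) -> int:
    """Convert a CPU quantity to millicores: '750m' -> 750, '2' -> 2000."""
    if value.endswith("m"):
        return int(value[:-1])
    return round(float(value) * 1000)


def parse_memory_mi(value: str) -> int:
    """Convert a memory quantity to MiB: '768Mi' -> 768, '2Gi' -> 2048."""
    if value.endswith("Gi"):
        return int(value[:-2]) * 1024
    if value.endswith("Mi"):
        return int(value[:-2])
    raise ValueError(f"unsupported memory quantity: {value}")


def default_limits(cpu: str, memory: str) -> dict:
    """Build a requests/limits pair where each limit is twice its request."""
    req_cpu = parse_cpu_millis(cpu)
    req_mem = parse_memory_mi(memory)
    return {
        "requests": {"cpu": f"{req_cpu}m", "memory": f"{req_mem}Mi"},
        "limits": {"cpu": f"{req_cpu * 2}m", "memory": f"{req_mem * 2}Mi"},
    }
```

For the curl example above, default_limits("750m", "768Mi") would produce limits of 1500m CPU and 1536Mi memory.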
Quota policy
Every tenant has a CPU and a memory cap stored in
paas_tenant_quotas. Defaults, applied when the tenant has no row
yet, are 4 CPU / 4096 MiB, which fits any size up to L
inclusive. Pushing past the cap returns a quota_exceeded error
with HTTP 422 Unprocessable Entity. The dashboard surfaces this as a toast: "Quota exceeded — upgrade your plan or pick a smaller size."
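The admission decision can be sketched as a pure function. Several details here are assumptions rather than documented behaviour: the function name admit, the {"error": "quota_exceeded"} body shape, and the choice to compare the limit values against the cap (which is consistent with the statement that the default fits sizes up to L, whose limits are 4 CPU / 4096 MiB).

```python
from typing import Optional, Tuple

# Defaults from the text above, used when the tenant has no quota row.
DEFAULT_QUOTA_CPU = 4.0
DEFAULT_QUOTA_MEMORY_MI = 4096


def admit(
    cpu: float,
    memory_mi: int,
    quota_row: Optional[Tuple[float, int]] = None,
) -> Tuple[int, dict]:
    """Return (http_status, body) for a resource change.

    quota_row is a hypothetical (quota_cpu, quota_memory_mi) tuple read from
    paas_tenant_quotas, or None when the tenant has no row yet.
    """
    if quota_row is None:
        quota_row = (DEFAULT_QUOTA_CPU, DEFAULT_QUOTA_MEMORY_MI)
    quota_cpu, quota_memory_mi = quota_row
    if cpu > quota_cpu or memory_mi > quota_memory_mi:
        # Assumed body shape; only the status and error code are documented.
        return 422, {"error": "quota_exceeded"}
    return 200, {}
```

Under these assumptions, admit(4, 4096) passes on the default quota, admit(8, 8192) is rejected with 422, and the same request passes once the tenant row is raised to (16, 32768).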
To raise a tenant's cap (operator-only, billing flow lands later):
INSERT INTO paas_tenant_quotas (tenant_id, quota_cpu, quota_memory_mi)
VALUES ('acme', 16, 32768)
ON CONFLICT (tenant_id) DO UPDATE
SET quota_cpu = EXCLUDED.quota_cpu,
quota_memory_mi = EXCLUDED.quota_memory_mi,
updated_at = NOW();
UI
Open /apps/{id}/scale. Each process gets its own card:
- Status line: 2/2 ready · 0.25 CPU · 256Mi
- Size dropdown: Free / S / M / L / XL / 2XL
- Apply button (disabled if the picked size matches the current).
Apply triggers PUT /resources; on success the page refreshes the
scale query and toasts "Resources updated". On quota_exceeded
(422), the toast tells the operator how to recover.
CLI
Replicas, same as before:
paas scale web=3
Size (cycle 2 helper, full CLI wire-up lands in cycle 3):
paas scale web=L
web=Lg (typo) returns "invalid value: Lg (expected integer or one
of Free/S/M/L/XL/2XL)" immediately, with no silent coercion.
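The strict argument handling described above might look like the following sketch. The function name parse_scale_arg and its (process, kind, value) return shape are hypothetical; only the accepted values and the error message text are taken from this page.

```python
SIZES = ("Free", "S", "M", "L", "XL", "2XL")


def parse_scale_arg(arg: str):
    """Parse 'web=3' or 'web=L' into (process, kind, value).

    kind is 'replicas' for an integer and 'size' for a catalogue name.
    Anything else raises ValueError instead of being coerced.
    """
    process, sep, value = arg.partition("=")
    if not sep or not process:
        raise ValueError(f"expected PROCESS=VALUE, got: {arg}")
    if value.isascii() and value.isdigit():
        return process, "replicas", int(value)
    if value in SIZES:
        return process, "size", value
    raise ValueError(
        f"invalid value: {value} (expected integer or one of {'/'.join(SIZES)})"
    )
```

Checking for an integer first and a known size second keeps the two syntaxes unambiguous, since no size name is purely numeric.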
API endpoints
| Verb | Path | Body | Notes |
|---|---|---|---|
| GET | /v1/apps/{id}/scale | (none) | Returns one row per process: {type, replicas, ready, size, cpu, memory} |
| POST | /v1/apps/{id}/scale | {process_type, replicas} | Replicas only (existing) |
| PUT | /v1/apps/{id}/resources | {process_type, size?, cpu?, memory?} | Resources, runs quota admission |
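As an illustration of the PUT row, the sketch below builds the request with Python's standard library without sending it. The helper name build_resources_request is hypothetical, and the JSON content type is an assumption; the path, verb, bearer token header, and body fields follow the table and the curl example earlier on this page.

```python
import json
import urllib.request


def build_resources_request(base_url, app, token, process_type,
                            size=None, cpu=None, memory=None):
    """Build (but do not send) a PUT /v1/apps/{id}/resources request."""
    body = {"process_type": process_type}
    # Optional fields are only included when the caller supplies them.
    for key, value in (("size", size), ("cpu", cpu), ("memory", memory)):
        if value is not None:
            body[key] = value
    return urllib.request.Request(
        url=f"{base_url}/v1/apps/{app}/resources",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumed; not stated in the table
        },
    )
```

Passing the result to urllib.request.urlopen would send it; a 422 quota_exceeded response would then surface as urllib.error.HTTPError, which the caller can catch to show the recovery hint.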
Related
- Rolling Deploy: the engine that applies the resource patch with zero downtime
- Blueprint paas.toml: declare default resources up-front in [resources] and [scaling]