OpenSearch Addon

paas ships a managed OpenSearch addon backed by the OpenSearch Kubernetes Operator (opensearch.opster.io/v1). Each app gets a dedicated OpenSearchCluster CR, a generated admin password, and an OPENSEARCH_URL Secret pre-mounted at deploy time.

At a glance

| Capability | Default | Where to configure |
| --- | --- | --- |
| Engine | OpenSearch 2.13.0 | version in the create payload |
| Cluster CR | opensearch.opster.io/v1::OpenSearchCluster | derived from app_id |
| HTTP port | 9200 | (fixed) |
| Plans | free / standard / pro | plan in the create payload |
| RAM/heap ratio | 2:1 (container memory = 2 × JVM heap) | pinned by OpensearchPlanConfig |
| Connection user | admin (cycle 2 default; scoped app_user in cycle 3+) | not configurable |
| Connection URL | injected as OPENSEARCH_URL env var | via os-{app_id}-url Secret |

Plans (cycle 2 corrected ratio 2:1)

| Plan | Nodes | JVM heap | RAM (container) | Storage |
| --- | --- | --- | --- | --- |
| free | 1 | 256Mi | 512Mi | 5Gi |
| standard | 1 | 1Gi | 2Gi | 20Gi |
| pro | 3 | 2Gi | 4Gi | 50Gi |

The mapping lives in paas_database::opensearch_provisioner::opensearch_plan_config. Cycle 1 shipped RAM == heap (1:1 ratio, OOM-prone under indexing load) — cycle 2 corrected to the documented 2:1 ratio so the off-heap budget (Lucene file cache + direct memory + native libs) doesn't crash the pod under load.

RAM/heap ratio is 2:1 — never less

Container memory must be 2 × JVM heap. The JVM heap holds the young+old generations; the off-heap region needs roughly the same budget for Lucene's file cache, JVM direct memory, and native libraries. Sizing them equal (1:1) starves off-heap and the pod OOM-kills under any real indexing load. The arithmetic is pinned by opensearch_plan_*_memory_is_2x_heap cargo tests.
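
The ratio is easiest to keep honest when container memory is computed from the heap rather than written out by hand. A minimal Rust sketch of that idea follows; PlanSketch, plan_sketch, and the MiB/GiB field names are hypothetical stand-ins, not the real OpensearchPlanConfig, and only the numbers are taken from the plan table.

```rust
/// Hypothetical mirror of the plan table. Field names are illustrative and
/// do not claim to match the real `OpensearchPlanConfig`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PlanSketch {
    pub nodes: u32,
    pub heap_mib: u64,
    pub memory_mib: u64,
    pub storage_gib: u64,
}

/// Container memory = RAM_TO_HEAP_RATIO × JVM heap.
const RAM_TO_HEAP_RATIO: u64 = 2;

/// Maps a plan name to its resources; unknown plan names yield `None`.
pub fn plan_sketch(plan: &str) -> Option<PlanSketch> {
    // Only nodes, heap and storage are listed per plan.
    let (nodes, heap_mib, storage_gib) = match plan {
        "free" => (1, 256, 5),
        "standard" => (1, 1024, 20),
        "pro" => (3, 2048, 50),
        _ => return None,
    };
    Some(PlanSketch {
        nodes,
        heap_mib,
        // Derived, never hand-written, so the ratio cannot drift per plan.
        memory_mib: heap_mib * RAM_TO_HEAP_RATIO,
        storage_gib,
    })
}
```

With this shape, a check in the spirit of the opensearch_plan_*_memory_is_2x_heap tests reduces to asserting memory_mib == 2 * heap_mib for each plan name.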

Free plan: heap 256Mi is below recommended floor

free ships 256Mi heap (with 512Mi container RAM at the 2:1 ratio). The OpenSearch operator's recommended floor is 1Gi heap. Under any production indexing load free will OOM — recommend_plan_upgrade("free", "opensearch") returns Some("standard") and the create endpoint logs the upgrade hint via tracing::info!. The dashboard surfaces it via the polling loop.
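
A minimal sketch of how such an upgrade hint could be expressed. The argument order (plan, addon type) and the Some("standard") result come from the text above; the function body, the _sketch suffix, and the Option<&'static str> return type are assumptions made for illustration.

```rust
/// Hypothetical sketch: returns the plan a tenant should move to when the
/// current plan sits below the engine's recommended floor, else `None`.
pub fn recommend_plan_upgrade_sketch(plan: &str, addon_type: &str) -> Option<&'static str> {
    match (addon_type, plan) {
        // free has a 256Mi heap, below the 1Gi floor; standard is the
        // first plan that reaches 1Gi.
        ("opensearch", "free") => Some("standard"),
        // Every other combination carries no hint in this sketch.
        _ => None,
    }
}
```

Returning an Option keeps the hint advisory: the caller can log it and still create the addon on the requested plan.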

Lifecycle

flowchart LR
    A["POST /v1/apps/{id}/addons<br/>{type:'opensearch', plan, version}"]
    A --> B["addons.rs::create_addon_generic"]
    B --> C["credentials::generate_password()"]
    C --> D["ensure_opensearch_url_secret<br/>os-{app}-url Secret"]
    D --> E["ensure_opensearch_cluster<br/>OpenSearchCluster CR Patch::Apply"]
    E --> F["OpenSearch Operator<br/>provisions StatefulSet + Service"]
    F --> G["os-{app}.{ns} Service<br/>HTTP 9200"]
    H["dashboard polls<br/>poll_addon_status opensearch branch"]
    H --> I["get_opensearch_cluster<br/>parse_opensearch_status"]
    I -- "green / yellow" --> J["app_addons.status = 'Ready'<br/>ready_at stamped"]
    I -- "red" --> K["status = 'Failed'"]
    I -- "_/missing" --> L["status = 'Creating'"]

OPENSEARCH_URL format

http://admin:{generated_password}@os-{app_id}.paas-apps:9200

Materialised in the K8s Secret named os-{app_id}-url (key: OPENSEARCH_URL). The dashboard's paas config:set integration mounts it into the app pod's environment automatically.

Components:

  • http:// - cycle 2 doesn't ship TLS yet (SecurityPlugin disabled by default in this POC). Cycle 3+ switches to https:// via cert-manager-issued certs.
  • admin - Operator-managed default user. Cycle 3+ will provision a scoped app_user via the SecurityPlugin REST API; the URL shape stays the same, only the user and password values change, so client code that reads OPENSEARCH_URL is unaffected.
  • {generated_password} - paas_database::credentials::generate_password() emits a hex UUID v4 derivative of at least 60 characters. Never hardcoded.
  • os-{app_id} - general.serviceName set in the CR spec. The Operator emits a Service of the same name.
  • paas-apps - the namespace where every tenant addon lives.
  • 9200 - OpenSearch's standard HTTP wire port.
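
The components above can be assembled as in the following sketch. The function and constant names are hypothetical and are not the real opensearch_url formatter. The sketch also assumes the password is plain hex, so it needs no percent-encoding inside the URL's userinfo part.

```rust
// Values taken from the component list above.
const OPENSEARCH_NAMESPACE: &str = "paas-apps";
const OPENSEARCH_HTTP_PORT: u16 = 9200;
const OPENSEARCH_USER: &str = "admin";

/// Hypothetical formatter: builds the connection URL for one app.
/// Assumes `password` contains only hex characters (URL-safe as-is).
pub fn opensearch_url_sketch(app_id: &str, password: &str) -> String {
    format!(
        "http://{}:{}@os-{}.{}:{}",
        OPENSEARCH_USER, password, app_id, OPENSEARCH_NAMESPACE, OPENSEARCH_HTTP_PORT
    )
}
```

If a future password generator emitted characters outside the URL-safe set, the formatter would need to percent-encode the password before interpolation.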

Status lifecycle

| Operator .status.health | app_addons.status | Reason |
| --- | --- | --- |
| green | Ready | All shards active, full replicas |
| yellow | Ready | Primary shards OK, some replicas unassigned. Still queryable: better to let tenants connect and see the degraded indicator than block on a transient yellow during a node restart. |
| red | Failed | A primary shard is missing; the cluster is not queryable |
| (missing or unknown) | Creating | Operator hasn't populated .status.health yet (defensive: no 5xx surfaced) |
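
The projection in the table is a total function from an optional health string to one of three statuses. A minimal sketch, with a hypothetical AddonStatus enum and function name rather than the real parse_opensearch_status signature:

```rust
/// Hypothetical status type; the string forms match app_addons.status.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AddonStatus {
    Creating,
    Ready,
    Failed,
}

impl AddonStatus {
    pub fn as_str(self) -> &'static str {
        match self {
            AddonStatus::Creating => "Creating",
            AddonStatus::Ready => "Ready",
            AddonStatus::Failed => "Failed",
        }
    }
}

/// Maps the operator's `.status.health` (absent until the operator has
/// reconciled) to an addon status. Never fails: anything unrecognised is
/// treated as still creating.
pub fn project_health_sketch(health: Option<&str>) -> AddonStatus {
    match health {
        Some("green") | Some("yellow") => AddonStatus::Ready,
        Some("red") => AddonStatus::Failed,
        _ => AddonStatus::Creating,
    }
}
```

The catch-all arm is what keeps the polling endpoint from erroring on a missing or unexpected value.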

Versions

2.13 (or 2.13.0) is the accepted value for version. The default is OPENSEARCH_DEFAULT_VERSION = "2.13.0" — anything else (or None) falls back to the default so a malformed client payload can't ship a poisoned version to the operator.
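
The fallback rule amounts to an allow-list lookup with a default. A minimal sketch follows; ACCEPTED_VERSIONS and resolve_version_sketch are hypothetical names, and only OPENSEARCH_DEFAULT_VERSION and the two accepted spellings come from the text above.

```rust
pub const OPENSEARCH_DEFAULT_VERSION: &str = "2.13.0";

/// (accepted spelling, canonical version) pairs. Hypothetical table.
const ACCEPTED_VERSIONS: &[(&str, &str)] = &[("2.13", "2.13.0"), ("2.13.0", "2.13.0")];

/// Resolves the client-supplied version to a canonical one. A missing or
/// unrecognised value falls back to the default, so arbitrary client input
/// is never forwarded to the operator.
pub fn resolve_version_sketch(requested: Option<&str>) -> &'static str {
    requested
        .and_then(|v| ACCEPTED_VERSIONS.iter().find(|(alias, _)| *alias == v))
        .map(|(_, canonical)| *canonical)
        .unwrap_or(OPENSEARCH_DEFAULT_VERSION)
}
```

Because the function only ever returns strings from its own table, the value written into the CR is always one of a known set regardless of the request payload.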

Validation tests (DoD)

APP_ID=...                        # your app's UUID
TOKEN=$(paas auth print-token)
PAAS_URL=https://runtime.di2amp.com

# 1. Create the OpenSearch addon
curl -sk -X POST \
  -H "Authorization: Bearer $TOKEN" \
  -H "content-type: application/json" \
  -d '{"addon_type":"opensearch","plan":"standard","version":"2.13"}' \
  $PAAS_URL/v1/apps/$APP_ID/addons | jq .

# 2. Poll until Ready (≤ 5 min after operator install)
for i in 1 2 3 4 5 6 7 8 9 10; do
  HEALTH=$(kubectl -n paas-apps get opensearchcluster os-$APP_ID -o jsonpath='{.status.health}' 2>/dev/null)
  echo "tick $i: $HEALTH"
  [[ "$HEALTH" == "green" || "$HEALTH" == "yellow" ]] && break
  sleep 30
done

# 3. Verify OPENSEARCH_URL is mounted on the app pod
kubectl -n paas-apps exec deployment/app-$APP_ID-web -- env | grep OPENSEARCH_URL
# Expect: OPENSEARCH_URL=http://admin:...@os-{APP_ID}.paas-apps:9200

# 4. Smoke the cluster from inside the cluster
URL=$(kubectl -n paas-apps get secret os-$APP_ID-url -o jsonpath='{.data.OPENSEARCH_URL}' | base64 -d)
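# -i without -t, plus --quiet, keeps TTY noise and kubectl's own messages out of the stream piped to jq
kubectl -n paas-apps run os-cli --rm -i --quiet --image=curlimages/curl --restart=Never -- \
  curl -sk "$URL/_cluster/health" | jq .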

# 5. Index a doc, search it, delete the index
kubectl -n paas-apps exec deployment/app-$APP_ID-web -- sh -c '
  curl -sk -X POST "$OPENSEARCH_URL/test/_doc?refresh=true" -H "content-type: application/json" -d "{\"hello\":\"world\"}";
  curl -sk "$OPENSEARCH_URL/test/_search" | jq .;
  curl -sk -X DELETE "$OPENSEARCH_URL/test";
'

Implementation pointers

| Concern | File |
| --- | --- |
| Plan → resources mapping (ratio 2:1) | crates/database/src/opensearch_provisioner.rs::opensearch_plan_config |
| RFC-1123 cluster name | crates/database/src/opensearch_provisioner.rs::opensearch_cluster_name (cross-addon sanitize_dns_label) |
| OpenSearchCluster spec builder | crates/database/src/opensearch_provisioner.rs::build_opensearchcluster_spec |
| OPENSEARCH_URL formatter | crates/database/src/opensearch_provisioner.rs::opensearch_url |
| Secret materialiser | crates/database/src/opensearch_provisioner.rs::ensure_opensearch_url_secret |
| CR get-or-create | crates/database/src/opensearch_provisioner.rs::ensure_opensearch_cluster |
| Status projection (green+yellow → Ready) | crates/database/src/opensearch_provisioner.rs::parse_opensearch_status |
| Cross-addon plan upgrade hint | crates/database/src/opensearch_provisioner.rs::recommend_plan_upgrade |
| Route handler | crates/control-plane/src/routes/addons.rs::create_addon_generic (opensearch branch) |
| Polling loop | crates/control-plane/src/routes/addons.rs::poll_addon_status (opensearch branch) |
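
For readers unfamiliar with the RFC-1123 constraint behind the cluster name (lowercase alphanumerics and '-', alphanumeric at both ends, at most 63 characters), here is a minimal sketch of a label sanitiser. It is an assumption about what a helper like sanitize_dns_label might do, not the actual implementation; both function names are hypothetical.

```rust
/// Maximum length of a DNS label per RFC 1123.
const MAX_LABEL_LEN: usize = 63;

/// Hypothetical sanitiser: lowercases, replaces every character outside
/// [a-z0-9] with '-', trims '-' from both ends, and caps the length.
pub fn sanitize_dns_label_sketch(raw: &str) -> String {
    let mapped: String = raw
        .chars()
        .map(|c| {
            let c = c.to_ascii_lowercase();
            if c.is_ascii_lowercase() || c.is_ascii_digit() { c } else { '-' }
        })
        .collect();
    // `mapped` is pure ASCII here, so taking chars is a byte-safe truncation.
    let truncated: String = mapped.trim_matches('-').chars().take(MAX_LABEL_LEN).collect();
    // Truncation may leave a trailing '-', so trim the end once more.
    truncated.trim_end_matches('-').to_string()
}

/// Hypothetical cluster-name builder using the os-{app_id} convention.
pub fn cluster_name_sketch(app_id: &str) -> String {
    sanitize_dns_label_sketch(&format!("os-{}", app_id))
}
```

An input made up entirely of invalid characters collapses to an empty string in this sketch, so a real implementation would need to reject or substitute that case.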

Cluster pre-requisites

The OpenSearch Operator must be installed before the addon can be provisioned. The Helm repo at opster.github.io returns 404, so paas's hot-fix runbook applies the operator manifest from the GitHub release directly with kubectl apply:

# Find the latest release tag
LATEST=$(curl -s https://api.github.com/repos/opensearch-project/opensearch-k8s-operator/releases/latest \
    | grep tag_name | cut -d'"' -f4)

# Apply the operator manifest cluster-wide
kubectl apply -f "https://github.com/opensearch-project/opensearch-k8s-operator/releases/download/${LATEST}/opensearch-operator.yaml"

# Sanity:
kubectl get crd | grep opensearch.opster.io
# Expect: opensearchclusters.opensearch.opster.io
kubectl -n opensearch-operator-system get pods
# Expect: opensearch-operator-controller-manager-... 2/2 Running

Documented in bilans/HOTFIXES.md so the next operator running the runbook gets the correct fallback first try.

Limits (cycle 2)

  • SecurityPlugin - disabled by default (POC). The admin user is operator-managed. Cycle 3+ provisions a scoped app_user via the SecurityPlugin REST API.
  • TLS - cycle 2 ships HTTP only. cert-manager-issued certs and the https:// URL switch arrive in cycle 3+.
  • Index management (sub-brique 42d) - out of scope for paas (cahier-confirmed). Tenants manage their own indices via the REST API.
  • OpenSearch Dashboards UI - out of scope for cycles 1-2.
  • Snapshot/backup - out of scope.

See also

  • Add-ons - umbrella addon flow that mysql / postgres / valkey / opensearch / clickhouse all hang off.
  • MySQL Addon - sister addon backed by Oracle MySQL Operator.
  • ClickHouse Addon - sister addon backed by Altinity ClickHouse Operator.