Horizontal scaling (since v2.0)¶
The gateway holds short-lived state — MCP Streamable HTTP session metadata, OIDC flow state (PKCE verifier + nonce + return target), DCR-registered client metadata, and (with Phase F10) federation catalogue cache. Until v2.0 all of this lived in process memory, so running more than one replica required sticky-session ingress and risked silently losing state on pod rolls.
v2.0 introduces an external session store the gateway can point at — today an in-memory map (the default, preserves the pre-F8 behaviour) or a Redis-backed store for true multi-replica HA.
When you need it¶
| Deployment shape | Configuration |
|---|---|
| Single replica, demo / dev | Defaults. Nothing to do. |
| Single replica, prod | Defaults. Pod recreated on rolling release → momentary downtime expected. |
| Multi-replica, prod | Enable the Redis store. Otherwise OIDC callbacks land on a replica that doesn't know about the flow state, and Streamable HTTP sessions stick to whichever pod served them. |
Enable the Redis store¶
Env (when running standalone)¶
bash
export OMCP_REDIS_URL=redis://redis.observability.svc.cluster.local:6379
export OMCP_REDIS_KEY_PREFIX=omcp: # optional; default omcp:
Helm¶
The chart does not ship a Redis subchart by design (operators usually have their own managed Redis or in-cluster instance). Set the URL on your existing Redis:
yaml
replicaCount: 3
strategy:
type: RollingUpdate
redis:
enabled: true
url: "redis://redis.shared.svc.cluster.local:6379"
keyPrefix: "omcp-prod:"
replicaCount: 3 + RollingUpdate give continuous availability; the
chart automatically renders a PodDisruptionBudget with
minAvailable: 1 (override via podDisruptionBudget.minAvailable)
whenever replicaCount > 1.
Without the shared store¶
If you cannot run Redis yet but still need >1 replica (e.g. for
horizontal request throughput on stateless /api/* paths), enable
sticky-session ingress so each session lands consistently on the
same pod:
yaml
replicaCount: 3
ingressSticky:
enabled: true
This renders nginx-ingress annotations
(affinity: cookie, session-cookie-name: omcp-affinity). Other
ingress controllers need an equivalent override via
ingress.annotations.
This is a fallback — OIDC login attempts can still fail when the callback hits a different pod than the login start. The Redis store is the production answer.
Failure modes¶
- Redis unreachable at boot → the gateway logs a warning and falls back to in-memory. It boots. Multi-replica deployments lose cross-replica coherence until Redis returns; restart the pod to pick up the live store once Redis is healthy.
- Redis flaps mid-run → each
get/setreturns the driver's error, which the calling subsystem (OIDC / DCR / federation) handles by treating the entry as missing and re-issuing the underlying request. No crash. - Replicas with different
OMCP_REDIS_KEY_PREFIX→ each replica effectively reads its own keyspace. Tie the prefix to the Helm release name so this can't drift.
Capacity guidance¶
- Key count: O(active sessions + active OIDC flows + DCR registrations + federation cache). Production realistic upper bound: ~10k. A 1 GB Redis is way over-provisioned; share with whatever other tooling is in the cluster.
- TTLs: OIDC flow state writes with a 10-minute TTL (matching
the spec's
acceptable clock skew + login latencywindow). MCP Streamable HTTP sessions use no TTL (Streamable HTTP itself owns liveness via the session-id header). DCR registrations are permanent (deleted explicitly on revoke).
Transport session map (Q11 / v3.1)¶
Until v3.1, the MCP Streamable HTTP sessions held in a process-local
Map<sessionId, Transport> were the only per-session state — no
cross-replica visibility. With replicaCount > 1 and no sticky
ingress, a request for sessionId S minted on replica A could land
on replica B; B silently created a new transport and the client
ended up alternating between two transports for the same logical
session.
v3.1 promotes the per-session metadata (owner-replica id +
last-active + virtual-server product slug) onto the same
SessionStore backend the OIDC + DCR state already uses
(mcp-server/src/transport/transportSessionMap.ts). Pick:
InMemoryTransportSessionMap— identical to pre-v3.1 behaviour, used when no Redis is configured.SessionStoreBackedTransportSessionMap— wraps the existingRedisSessionStore, every replica writes its own session metadata to the shared store. A replica that receives a request for a locally-unknown sessionId consults the shared map and refuses with410 Goneif the session was minted elsewhere; the client retries and load-balancer rehashing eventually finds the owner.
Transport objects themselves stay local — they hold open HTTP
response handles and aren't serialisable. The TTL on each metadata
entry (default 30 minutes matching SESSION_TTL_MS) reaps stale
entries when a replica disappears without graceful shutdown.
With this in place sticky ingress is a performance optimisation, not a correctness requirement.
Migration from in-memory single replica¶
Single-replica deployments without OMCP_REDIS_URL continue working
identically — F8 changes nothing for them. A best-effort serialize on
SIGTERM (write open sessions to OMCP_STATE_DIR/sessions.json,
re-read on boot) is on the roadmap so a single-replica rolling
restart doesn't drop in-flight sessions; until it lands, plan for the
brief window of session loss that a single-replica Recreate
strategy implies.