Add Envoy Gateway with Optional Caching for Helm Chart
Summary
Add Envoy Gateway as an alternative ingress option for the MCP Gateway Helm chart, supporting the Kubernetes Gateway API standard with optional HTTP caching. This provides HTTP/2 upstream support, advanced traffic management, and cloud-native observability.
Motivation
Current ingress-nginx limitations in Kubernetes:
- No HTTP/2 to backend pods
- Limited observability (requires separate ServiceMonitor)
- No native Gateway API support (requires separate controller)
Envoy Gateway advantages:
- Conformant implementation of the Kubernetes Gateway API
- Native HTTP/2 and gRPC to upstream pods
- Built-in Prometheus metrics
- Advanced traffic policies (rate limiting, circuit breaking, retries)
- Dynamic configuration without pod restarts
- Better WebSocket and SSE support
Implementation
1. Add Envoy Gateway CRDs and Controller
Add to charts/mcp-stack/templates/:
gateway-class.yaml:
{{- if .Values.envoyGateway.enabled }}
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: {{ include "mcp-stack.fullname" . }}-envoy
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
  parametersRef:
    group: gateway.envoyproxy.io
    kind: EnvoyProxy
    name: {{ include "mcp-stack.fullname" . }}-proxy-config
    namespace: {{ .Release.Namespace }}
{{- end }}
gateway.yaml:
{{- if .Values.envoyGateway.enabled }}
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: {{ include "mcp-stack.fullname" . }}
  namespace: {{ .Release.Namespace }}
spec:
  gatewayClassName: {{ include "mcp-stack.fullname" . }}-envoy
  listeners:
    - name: http
      protocol: HTTP
      port: 80
      allowedRoutes:
        namespaces:
          from: Same
    {{- if .Values.envoyGateway.tls.enabled }}
    - name: https
      protocol: HTTPS
      port: 443
      tls:
        mode: Terminate
        certificateRefs:
          - kind: Secret
            name: {{ .Values.envoyGateway.tls.secretName }}
      allowedRoutes:
        namespaces:
          from: Same
    {{- end }}
{{- end }}
httproute.yaml:
{{- if .Values.envoyGateway.enabled }}
{{- $host := .Values.envoyGateway.hostname | default "*" }}
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: {{ include "mcp-stack.fullname" . }}-routes
  namespace: {{ .Release.Namespace }}
spec:
  parentRefs:
    - name: {{ include "mcp-stack.fullname" . }}
  # A bare "*" is not a valid Gateway API hostname (and an unquoted * is a YAML
  # alias marker), so omit the field to match any host.
  {{- if ne $host "*" }}
  hostnames:
    - {{ $host | quote }}
  {{- end }}
  rules:
    # Health check - short timeout
    - matches:
        - path:
            type: Exact
            value: /health
      backendRefs:
        - name: {{ include "mcp-stack.fullname" . }}-gateway
          port: 4444
      timeouts:
        request: 5s
    # SSE endpoints - extended timeouts
    - matches:
        - path:
            type: RegularExpression
            value: "^/servers/.*/sse$"
      backendRefs:
        - name: {{ include "mcp-stack.fullname" . }}-gateway
          port: 4444
      timeouts:
        request: 3600s
    # WebSocket endpoints
    - matches:
        - path:
            type: RegularExpression
            value: "^/servers/.*/ws$"
      backendRefs:
        - name: {{ include "mcp-stack.fullname" . }}-gateway
          port: 4444
      timeouts:
        request: 3600s
    # All other routes
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: {{ include "mcp-stack.fullname" . }}-gateway
          port: 4444
      timeouts:
        request: 60s
{{- end }}
envoy-proxy-config.yaml:
{{- if .Values.envoyGateway.enabled }}
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: {{ include "mcp-stack.fullname" . }}-proxy-config
  namespace: {{ .Release.Namespace }}
spec:
  provider:
    type: Kubernetes
    kubernetes:
      envoyDeployment:
        replicas: {{ .Values.envoyGateway.replicas | default 2 }}
        container:
          resources:
            limits:
              cpu: {{ .Values.envoyGateway.resources.limits.cpu | default "2" }}
              memory: {{ .Values.envoyGateway.resources.limits.memory | default "1Gi" }}
            requests:
              cpu: {{ .Values.envoyGateway.resources.requests.cpu | default "500m" }}
              memory: {{ .Values.envoyGateway.resources.requests.memory | default "512Mi" }}
  telemetry:
    metrics:
      prometheus: {}
      # Only emit the OpenTelemetry sink when envoyGateway.otel.enabled is true
      {{- if .Values.envoyGateway.otel.enabled }}
      sinks:
        - type: OpenTelemetry
          openTelemetry:
            host: {{ .Values.envoyGateway.otel.host | default "otel-collector" }}
            port: {{ .Values.envoyGateway.otel.port | default 4317 }}
      {{- end }}
{{- end }}
backend-traffic-policy.yaml (rate limiting, circuit breaking):
{{- if and .Values.envoyGateway.enabled .Values.envoyGateway.trafficPolicy.enabled }}
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: {{ include "mcp-stack.fullname" . }}-traffic-policy
  namespace: {{ .Release.Namespace }}
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: {{ include "mcp-stack.fullname" . }}-routes
  rateLimit:
    type: Local
    local:
      rules:
        - limit:
            requests: {{ .Values.envoyGateway.rateLimit.requestsPerSecond | default 3000 }}
            unit: Second
  circuitBreaker:
    maxConnections: {{ .Values.envoyGateway.circuitBreaker.maxConnections | default 4096 }}
    maxPendingRequests: {{ .Values.envoyGateway.circuitBreaker.maxPendingRequests | default 4096 }}
    # The Envoy Gateway field is maxParallelRequests; the values key stays maxRequests
    maxParallelRequests: {{ .Values.envoyGateway.circuitBreaker.maxRequests | default 4096 }}
  retry:
    numRetries: 2
    retryOn:
      triggers:
        - "5xx"
        - reset
        - connect-failure
{{- end }}
2. Add Optional Caching with External Service
varnish-deployment.yaml (optional caching layer):
{{- if .Values.envoyGateway.cache.enabled }}
# Note: this template defines only the Deployment and its VCL ConfigMap. A Service
# for the Varnish pods and an HTTPRoute backendRef pointing at it are still needed
# before requests actually pass through the cache.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "mcp-stack.fullname" . }}-varnish
  namespace: {{ .Release.Namespace }}
spec:
  replicas: {{ .Values.envoyGateway.cache.replicas | default 2 }}
  selector:
    matchLabels:
      app: {{ include "mcp-stack.fullname" . }}-varnish
  template:
    metadata:
      labels:
        app: {{ include "mcp-stack.fullname" . }}-varnish
    spec:
      containers:
        - name: varnish
          image: {{ .Values.envoyGateway.cache.image | default "varnish:7.6" }}
          args:
            - "-s"
            - "malloc,{{ .Values.envoyGateway.cache.memorySize | default "1G" }}"
            - "-a"
            - ":80"
          ports:
            - containerPort: 80
          volumeMounts:
            - name: vcl-config
              mountPath: /etc/varnish/default.vcl
              subPath: default.vcl
          resources:
            limits:
              cpu: {{ .Values.envoyGateway.cache.resources.limits.cpu | default "2" }}
              memory: {{ .Values.envoyGateway.cache.resources.limits.memory | default "2Gi" }}
            requests:
              cpu: {{ .Values.envoyGateway.cache.resources.requests.cpu | default "500m" }}
              memory: {{ .Values.envoyGateway.cache.resources.requests.memory | default "1Gi" }}
      volumes:
        - name: vcl-config
          configMap:
            name: {{ include "mcp-stack.fullname" . }}-varnish-vcl
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "mcp-stack.fullname" . }}-varnish-vcl
  namespace: {{ .Release.Namespace }}
data:
  default.vcl: |
    vcl 4.1;

    backend default {
      .host = "{{ include "mcp-stack.fullname" . }}-gateway";
      .port = "4444";
      .connect_timeout = 5s;
      .first_byte_timeout = 60s;
      .between_bytes_timeout = 60s;
    }

    sub vcl_recv {
      # Only cache GET and HEAD
      if (req.method != "GET" && req.method != "HEAD") {
        return (pass);
      }
      # Don't cache admin, SSE, WebSocket
      if (req.url ~ "^/admin" || req.url ~ "/sse$" || req.url ~ "/ws$") {
        return (pass);
      }
      # Cache static assets
      if (req.url ~ "\.(css|js|png|jpg|jpeg|gif|ico|svg|woff|woff2)$") {
        return (hash);
      }
      # Cache API read endpoints (TTL is set in vcl_backend_response)
      if (req.url ~ "^/(tools|servers|gateways|resources|prompts|version)$") {
        return (hash);
      }
      return (pass);
    }

    sub vcl_backend_response {
      # Static assets: 30 days
      if (bereq.url ~ "\.(css|js|png|jpg|jpeg|gif|ico|svg|woff|woff2)$") {
        set beresp.ttl = 30d;
      }
      # API endpoints: 5 minutes
      elsif (bereq.url ~ "^/(tools|servers|gateways|resources|prompts|version)$") {
        set beresp.ttl = 5m;
      }
    }

    sub vcl_deliver {
      if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
      } else {
        set resp.http.X-Cache = "MISS";
      }
    }
{{- end }}
3. Update values.yaml
Add to charts/mcp-stack/values.yaml:
# Envoy Gateway configuration (alternative to ingress-nginx)
envoyGateway:
  enabled: false  # Set to true to use Envoy Gateway instead of ingress-nginx
  # Hostname for the gateway (use "*" for any)
  hostname: "*"
  # Number of Envoy proxy replicas
  replicas: 2
  # Resource limits
  resources:
    limits:
      cpu: "2"
      memory: "1Gi"
    requests:
      cpu: "500m"
      memory: "512Mi"
  # TLS configuration
  tls:
    enabled: false
    secretName: mcp-gateway-tls
  # Traffic policy (rate limiting, circuit breaking)
  trafficPolicy:
    enabled: true
  # Rate limiting
  rateLimit:
    requestsPerSecond: 3000
  # Circuit breaker
  circuitBreaker:
    maxConnections: 4096
    maxPendingRequests: 4096
    maxRequests: 4096
  # OpenTelemetry integration
  otel:
    enabled: false
    host: otel-collector
    port: 4317
  # Optional: Varnish caching layer
  cache:
    enabled: false
    replicas: 2
    image: varnish:7.6
    memorySize: "1G"
    resources:
      limits:
        cpu: "2"
        memory: "2Gi"
      requests:
        cpu: "500m"
        memory: "1Gi"
4. Add ServiceMonitor for Prometheus
servicemonitor.yaml:
{{- if and .Values.envoyGateway.enabled .Values.monitoring.enabled }}
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: {{ include "mcp-stack.fullname" . }}-envoy
  namespace: {{ .Release.Namespace }}
spec:
  selector:
    matchLabels:
      gateway.envoyproxy.io/owning-gateway-name: {{ include "mcp-stack.fullname" . }}
  endpoints:
    - port: metrics
      interval: 30s
      path: /stats/prometheus
{{- end }}
Prerequisites
Envoy Gateway must be installed in the cluster:
# Install Envoy Gateway
helm install eg oci://docker.io/envoyproxy/gateway-helm \
  --version v1.2.0 \
  -n envoy-gateway-system --create-namespace
# Verify CRDs
kubectl get crd | grep gateway
Tasks
- Create Gateway API resource templates (GatewayClass, Gateway, HTTPRoute)
- Create EnvoyProxy configuration template
- Create BackendTrafficPolicy template (rate limiting, circuit breaking)
- Add optional Varnish caching layer templates
- Update values.yaml with envoyGateway configuration section
- Add ServiceMonitor for Prometheus integration
- Update Chart.yaml dependencies if needed
- Create documentation for Envoy Gateway setup
- Add example values for common configurations
- Test with HTTP/2 enabled on gateway pods
- Test SSE and WebSocket passthrough
- Test rate limiting behavior
- Test circuit breaker behavior
- Benchmark performance comparison vs ingress-nginx
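As a starting point for the "example values" task, here is a minimal sketch of an override that turns on the OpenTelemetry sink and the optional Varnish layer. It uses only keys from the proposed envoyGateway section. The file name, hostname, and sizing numbers are illustrative assumptions, not tested recommendations.

```yaml
# values-envoy-observability.yaml (hypothetical file name)
envoyGateway:
  enabled: true
  hostname: "mcp.example.com"   # placeholder hostname
  replicas: 2
  # Export metrics to an OpenTelemetry collector in addition to Prometheus
  otel:
    enabled: true
    host: otel-collector        # assumed collector Service name
    port: 4317
  # Optional Varnish layer, sized small for a first trial
  cache:
    enabled: true
    replicas: 2
    memorySize: "512M"          # illustrative; passed to varnishd as malloc,512M
    resources:
      limits:
        cpu: "1"
        memory: "1Gi"
      requests:
        cpu: "250m"
        memory: "768Mi"
```

The container memory request is kept above memorySize because Varnish needs headroom beyond its malloc storage.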
Acceptance Criteria
- Envoy Gateway deploys when envoyGateway.enabled=true
- Traffic routes correctly to gateway pods via HTTPRoute
- HTTP/2 works between Envoy and gateway pods
- SSE and WebSocket endpoints work with extended timeouts
- Rate limiting returns 429 when exceeded
- Circuit breaker rejects excess traffic once the configured connection and request limits are reached
- Prometheus metrics available via ServiceMonitor
- Optional Varnish cache layer works correctly
- TLS termination works when enabled
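The rate limiting and circuit breaker criteria are hard to trigger with the default thresholds (3000 requests per second, 4096 connections). One way to exercise them is a test override with deliberately low limits. The sketch below uses only keys from the proposed values.yaml. The file name and numbers are assumptions chosen for illustration.

```yaml
# values-limits-test.yaml (hypothetical file name, for test clusters only)
envoyGateway:
  enabled: true
  # Local rate limits are enforced per Envoy instance, so a single replica
  # keeps the effective limit equal to the configured one.
  replicas: 1
  trafficPolicy:
    enabled: true
  rateLimit:
    requestsPerSecond: 5        # small enough to hit 429 with a short request burst
  circuitBreaker:
    maxConnections: 8
    maxPendingRequests: 4
    maxRequests: 8
```

With this in place, a short burst of requests against any routed path should start returning 429 once the per-second budget is spent.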
Migration Path
To migrate from ingress-nginx to Envoy Gateway:
# Disable ingress-nginx
ingress:
  enabled: false
# Enable Envoy Gateway
envoyGateway:
  enabled: true
  hostname: "mcp.example.com"
  tls:
    enabled: true
    secretName: mcp-gateway-tls
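A more cautious cutover runs both entry points side by side before disabling ingress-nginx. The sketch below assumes the chart's existing ingress template is unaffected by envoyGateway.enabled, which matches the new templates above since they are gated only on envoyGateway.enabled. The canary hostname and file name are placeholders.

```yaml
# values-parallel-run.yaml (hypothetical file name)
# Step 1: keep ingress-nginx serving production traffic
ingress:
  enabled: true
# Step 2: expose Envoy Gateway on a separate hostname for validation
envoyGateway:
  enabled: true
  hostname: "mcp-canary.example.com"   # placeholder hostname
  tls:
    enabled: true
    secretName: mcp-gateway-tls
```

Once the canary hostname passes the acceptance criteria, switch envoyGateway.hostname to the production name and set ingress.enabled to false.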