[PERFORMANCE]: Add Envoy Gateway with Optional Caching for Helm Chart #1864

@crivetimihai

Summary

Add Envoy Gateway as an alternative ingress option for the MCP Gateway Helm chart, supporting the Kubernetes Gateway API standard with optional HTTP caching. This provides HTTP/2 upstream support, advanced traffic management, and cloud-native observability.

Motivation

Current ingress-nginx limitations in Kubernetes:

  • No HTTP/2 to backend pods (upstream proxying uses HTTP/1.1, apart from gRPC backends)
  • Limited observability (requires separate ServiceMonitor)
  • No native Gateway API support (requires separate controller)

Envoy Gateway advantages:

  • Conformant implementation of the Kubernetes Gateway API
  • Native HTTP/2 and gRPC to upstream pods
  • Built-in Prometheus metrics
  • Advanced traffic policies (rate limiting, circuit breaking, retries)
  • Dynamic configuration without pod restarts
  • Better WebSocket and SSE support

Implementation

1. Add Gateway API and Envoy Gateway Resources

Add to charts/mcp-stack/templates/:

gateway-class.yaml:

{{- if .Values.envoyGateway.enabled }}
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: {{ include "mcp-stack.fullname" . }}-envoy
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
  parametersRef:
    group: gateway.envoyproxy.io
    kind: EnvoyProxy
    name: {{ include "mcp-stack.fullname" . }}-proxy-config
    namespace: {{ .Release.Namespace }}
{{- end }}

gateway.yaml:

{{- if .Values.envoyGateway.enabled }}
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: {{ include "mcp-stack.fullname" . }}
  namespace: {{ .Release.Namespace }}
spec:
  gatewayClassName: {{ include "mcp-stack.fullname" . }}-envoy
  listeners:
    - name: http
      protocol: HTTP
      port: 80
      allowedRoutes:
        namespaces:
          from: Same
    {{- if .Values.envoyGateway.tls.enabled }}
    - name: https
      protocol: HTTPS
      port: 443
      tls:
        mode: Terminate
        certificateRefs:
          - kind: Secret
            name: {{ .Values.envoyGateway.tls.secretName }}
      allowedRoutes:
        namespaces:
          from: Same
    {{- end }}
{{- end }}

httproute.yaml:

{{- if .Values.envoyGateway.enabled }}
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: {{ include "mcp-stack.fullname" . }}-routes
  namespace: {{ .Release.Namespace }}
spec:
  parentRefs:
    - name: {{ include "mcp-stack.fullname" . }}
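  # A bare "*" is not a valid HTTPRoute hostname; omit the field to match any host
  {{- if and .Values.envoyGateway.hostname (ne .Values.envoyGateway.hostname "*") }}
  hostnames:
    - {{ .Values.envoyGateway.hostname | quote }}
  {{- end }}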
  rules:
    # Health check - short timeout
    - matches:
        - path:
            type: Exact
            value: /health
      backendRefs:
        - name: {{ include "mcp-stack.fullname" . }}-gateway
          port: 4444
      timeouts:
        request: 5s

    # SSE endpoints - extended timeouts
    - matches:
        - path:
            type: RegularExpression
            value: "^/servers/.*/sse$"
      backendRefs:
        - name: {{ include "mcp-stack.fullname" . }}-gateway
          port: 4444
      timeouts:
        request: 3600s

    # WebSocket endpoints
    - matches:
        - path:
            type: RegularExpression
            value: "^/servers/.*/ws$"
      backendRefs:
        - name: {{ include "mcp-stack.fullname" . }}-gateway
          port: 4444
      timeouts:
        request: 3600s

    # All other routes
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: {{ include "mcp-stack.fullname" . }}-gateway
          port: 4444
      timeouts:
        request: 60s
{{- end }}
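
For Envoy to use HTTP/2 toward port 4444, the backend Service generally has to declare the protocol; the route above does not select it. A minimal sketch, assuming the gateway pods accept cleartext HTTP/2 (h2c) and that the chart's existing gateway Service is named as the backendRefs imply. The port name `http` is a placeholder, and the selector is left out because it should stay whatever the chart already uses:

```yaml
# Excerpt of the gateway Service (hypothetical layout); only appProtocol is new.
apiVersion: v1
kind: Service
metadata:
  name: {{ include "mcp-stack.fullname" . }}-gateway
  namespace: {{ .Release.Namespace }}
spec:
  # selector: unchanged from the chart's existing Service
  ports:
    - name: http            # placeholder port name
      port: 4444
      targetPort: 4444
      # Standard Kubernetes appProtocol value for HTTP/2 over cleartext
      appProtocol: kubernetes.io/h2c
```

If the application server only speaks HTTP/1.1, leave appProtocol unset so that upstream connections stay on HTTP/1.1.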

envoy-proxy-config.yaml:

{{- if .Values.envoyGateway.enabled }}
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: {{ include "mcp-stack.fullname" . }}-proxy-config
  namespace: {{ .Release.Namespace }}
spec:
  provider:
    type: Kubernetes
    kubernetes:
      envoyDeployment:
        replicas: {{ .Values.envoyGateway.replicas | default 2 }}
        container:
          resources:
            limits:
              cpu: {{ .Values.envoyGateway.resources.limits.cpu | default "2" }}
              memory: {{ .Values.envoyGateway.resources.limits.memory | default "1Gi" }}
            requests:
              cpu: {{ .Values.envoyGateway.resources.requests.cpu | default "500m" }}
              memory: {{ .Values.envoyGateway.resources.requests.memory | default "512Mi" }}
  telemetry:
    metrics:
      prometheus: {}
      {{- if .Values.envoyGateway.otel.enabled }}
      sinks:
        - type: OpenTelemetry
          openTelemetry:
            host: {{ .Values.envoyGateway.otel.host | default "otel-collector" }}
            port: {{ .Values.envoyGateway.otel.port | default 4317 }}
      {{- end }}
{{- end }}

backend-traffic-policy.yaml (rate limiting, circuit breaking):

{{- if and .Values.envoyGateway.enabled .Values.envoyGateway.trafficPolicy.enabled }}
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: {{ include "mcp-stack.fullname" . }}-traffic-policy
  namespace: {{ .Release.Namespace }}
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: {{ include "mcp-stack.fullname" . }}-routes
  rateLimit:
    type: Local
    local:
      rules:
        - limit:
            requests: {{ .Values.envoyGateway.rateLimit.requestsPerSecond | default 3000 }}
            unit: Second
  circuitBreaker:
    maxConnections: {{ .Values.envoyGateway.circuitBreaker.maxConnections | default 4096 }}
    maxPendingRequests: {{ .Values.envoyGateway.circuitBreaker.maxPendingRequests | default 4096 }}
    maxParallelRequests: {{ .Values.envoyGateway.circuitBreaker.maxRequests | default 4096 }}
  retry:
    numRetries: 2
    retryOn:
      triggers:
        - "5xx"
        - reset
        - connect-failure
{{- end }}

2. Add Optional Caching with External Service

varnish-deployment.yaml (optional caching layer):

{{- if and .Values.envoyGateway.enabled .Values.envoyGateway.cache.enabled }}
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "mcp-stack.fullname" . }}-varnish
  namespace: {{ .Release.Namespace }}
spec:
  replicas: {{ .Values.envoyGateway.cache.replicas | default 2 }}
  selector:
    matchLabels:
      app: {{ include "mcp-stack.fullname" . }}-varnish
  template:
    metadata:
      labels:
        app: {{ include "mcp-stack.fullname" . }}-varnish
    spec:
      containers:
        - name: varnish
          image: {{ .Values.envoyGateway.cache.image | default "varnish:7.6" }}
          args:
            - "-s"
            - "malloc,{{ .Values.envoyGateway.cache.memorySize | default "1G" }}"
            - "-a"
            - ":80"
          ports:
            - containerPort: 80
          volumeMounts:
            - name: vcl-config
              mountPath: /etc/varnish/default.vcl
              subPath: default.vcl
          resources:
            limits:
              cpu: {{ .Values.envoyGateway.cache.resources.limits.cpu | default "2" }}
              memory: {{ .Values.envoyGateway.cache.resources.limits.memory | default "2Gi" }}
            requests:
              cpu: {{ .Values.envoyGateway.cache.resources.requests.cpu | default "500m" }}
              memory: {{ .Values.envoyGateway.cache.resources.requests.memory | default "1Gi" }}
      volumes:
        - name: vcl-config
          configMap:
            name: {{ include "mcp-stack.fullname" . }}-varnish-vcl
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "mcp-stack.fullname" . }}-varnish-vcl
  namespace: {{ .Release.Namespace }}
data:
  default.vcl: |
    vcl 4.1;

    backend default {
      .host = "{{ include "mcp-stack.fullname" . }}-gateway";
      .port = "4444";
      .connect_timeout = 5s;
      .first_byte_timeout = 60s;
      .between_bytes_timeout = 60s;
    }

    sub vcl_recv {
      # Only cache GET and HEAD
      if (req.method != "GET" && req.method != "HEAD") {
        return (pass);
      }

      # Don't cache admin, SSE, WebSocket
      if (req.url ~ "^/admin" || req.url ~ "/sse$" || req.url ~ "/ws$") {
        return (pass);
      }

      # Cache static assets
      if (req.url ~ "\.(css|js|png|jpg|jpeg|gif|ico|svg|woff|woff2)$") {
        return (hash);
      }

      # Cache API read endpoints for 5 minutes
      if (req.url ~ "^/(tools|servers|gateways|resources|prompts|version)$") {
        return (hash);
      }

      return (pass);
    }

    sub vcl_backend_response {
      # Static assets: 30 days
      if (bereq.url ~ "\.(css|js|png|jpg|jpeg|gif|ico|svg|woff|woff2)$") {
        set beresp.ttl = 30d;
      }
      # API endpoints: 5 minutes
      elsif (bereq.url ~ "^/(tools|servers|gateways|resources|prompts|version)$") {
        set beresp.ttl = 5m;
      }
    }

    sub vcl_deliver {
      if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
      } else {
        set resp.http.X-Cache = "MISS";
      }
    }
{{- end }}
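
The Deployment above is only reachable if something routes to it, and this section defines no Service for the Varnish pods. A minimal sketch of one, assuming Varnish is meant to sit between Envoy and the gateway pods; the Service name and the port name are assumptions, not part of the original proposal:

```yaml
{{- if and .Values.envoyGateway.enabled .Values.envoyGateway.cache.enabled }}
apiVersion: v1
kind: Service
metadata:
  name: {{ include "mcp-stack.fullname" . }}-varnish   # assumed name
  namespace: {{ .Release.Namespace }}
spec:
  selector:
    # Matches the pod label set in varnish-deployment.yaml
    app: {{ include "mcp-stack.fullname" . }}-varnish
  ports:
    - name: http   # assumed port name
      port: 80
      targetPort: 80
{{- end }}
```

The catch-all rule in httproute.yaml could then point its backendRefs at this Service on port 80 when cache.enabled is true, while the SSE and WebSocket rules keep targeting port 4444 directly.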

3. Update values.yaml

Add to charts/mcp-stack/values.yaml:

# Envoy Gateway configuration (alternative to ingress-nginx)
envoyGateway:
  enabled: false  # Set to true to use Envoy Gateway instead of ingress-nginx

  # Hostname for the gateway (use "*" for any)
  hostname: "*"

  # Number of Envoy proxy replicas
  replicas: 2

  # Resource limits
  resources:
    limits:
      cpu: "2"
      memory: "1Gi"
    requests:
      cpu: "500m"
      memory: "512Mi"

  # TLS configuration
  tls:
    enabled: false
    secretName: mcp-gateway-tls

  # Traffic policy (rate limiting, circuit breaking)
  trafficPolicy:
    enabled: true

  # Rate limiting
  rateLimit:
    requestsPerSecond: 3000

  # Circuit breaker
  circuitBreaker:
    maxConnections: 4096
    maxPendingRequests: 4096
    maxRequests: 4096

  # OpenTelemetry integration
  otel:
    enabled: false
    host: otel-collector
    port: 4317

  # Optional: Varnish caching layer
  cache:
    enabled: false
    replicas: 2
    image: varnish:7.6
    memorySize: "1G"
    resources:
      limits:
        cpu: "2"
        memory: "2Gi"
      requests:
        cpu: "500m"
        memory: "1Gi"

4. Add ServiceMonitor for Prometheus

servicemonitor.yaml:

{{- if and .Values.envoyGateway.enabled .Values.monitoring.enabled }}
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: {{ include "mcp-stack.fullname" . }}-envoy
  namespace: {{ .Release.Namespace }}
spec:
  selector:
    matchLabels:
      gateway.envoyproxy.io/owning-gateway-name: {{ include "mcp-stack.fullname" . }}
  endpoints:
    - port: metrics
      interval: 30s
      path: /stats/prometheus
{{- end }}
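
Envoy Gateway normally runs the proxy pods in its own namespace (envoy-gateway-system in the install command under Prerequisites), and the generated proxy Service may not expose a metrics port. If the ServiceMonitor above finds no targets, a PodMonitor is one alternative. A sketch under those assumptions: the container port name `metrics` and the owning-gateway-namespace label follow Envoy Gateway's usual conventions and should be checked against the installed version:

```yaml
{{- if and .Values.envoyGateway.enabled .Values.monitoring.enabled }}
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: {{ include "mcp-stack.fullname" . }}-envoy
  namespace: {{ .Release.Namespace }}
spec:
  # Assumption: proxy pods live in the Envoy Gateway controller namespace
  namespaceSelector:
    matchNames:
      - envoy-gateway-system
  selector:
    matchLabels:
      gateway.envoyproxy.io/owning-gateway-name: {{ include "mcp-stack.fullname" . }}
      gateway.envoyproxy.io/owning-gateway-namespace: {{ .Release.Namespace }}
  podMetricsEndpoints:
    - port: metrics          # assumed container port name
      interval: 30s
      path: /stats/prometheus
{{- end }}
```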

Prerequisites

Envoy Gateway must be installed in the cluster:

# Install Envoy Gateway
helm install eg oci://docker.io/envoyproxy/gateway-helm \
  --version v1.2.0 \
  -n envoy-gateway-system --create-namespace

# Verify CRDs
kubectl get crd | grep gateway

Tasks

  • Create Gateway API resource templates (GatewayClass, Gateway, HTTPRoute)
  • Create EnvoyProxy configuration template
  • Create BackendTrafficPolicy template (rate limiting, circuit breaking)
  • Add optional Varnish caching layer templates
  • Update values.yaml with envoyGateway configuration section
  • Add ServiceMonitor for Prometheus integration
  • Update Chart.yaml dependencies if needed
  • Create documentation for Envoy Gateway setup
  • Add example values for common configurations
  • Test with HTTP/2 enabled on gateway pods
  • Test SSE and WebSocket passthrough
  • Test rate limiting behavior
  • Test circuit breaker behavior
  • Benchmark performance comparison vs ingress-nginx

Acceptance Criteria

  1. Gateway API resources are created and the Envoy proxy is provisioned when envoyGateway.enabled=true
  2. Traffic routes correctly to gateway pods via HTTPRoute
  3. HTTP/2 works between Envoy and gateway pods
  4. SSE and WebSocket endpoints work with extended timeouts
  5. Rate limiting returns 429 when exceeded
  6. Circuit breaker rejects requests once the configured connection or request limits are exceeded
  7. Prometheus metrics available via ServiceMonitor
  8. Optional Varnish cache layer works correctly
  9. TLS termination works when enabled

Migration Path

To migrate from ingress-nginx to Envoy Gateway:

# Disable ingress-nginx
ingress:
  enabled: false

# Enable Envoy Gateway
envoyGateway:
  enabled: true
  hostname: "mcp.example.com"
  tls:
    enabled: true
    secretName: mcp-gateway-tls
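
As a further example, a values override that also turns on the optional pieces might look like the following. It uses only keys defined in the values.yaml section above; the numbers are illustrative, not tuned recommendations:

```yaml
envoyGateway:
  enabled: true
  hostname: "mcp.example.com"
  replicas: 3
  trafficPolicy:
    enabled: true
  rateLimit:
    requestsPerSecond: 500   # illustrative value
  otel:
    enabled: true
    host: otel-collector
    port: 4317
  cache:
    enabled: true
    memorySize: "512M"       # passed to Varnish as malloc,512M
```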
