Every gateway ships with a set of built-in policies. Authentication. Rate limiting. Request routing. Prompt guards. These cover most use cases. But what about the ones they don’t cover?

What if you need to add a custom header based on a database lookup? What if you need to transform a request body in a way no existing filter supports? What if your business has unique logic that no off-the-shelf gateway can anticipate?

You build your own extension.

This article walks through exactly how to do that using agentgateway, Envoy, and Rust. In this tutorial, you’ll learn how to:

  • Build a custom Envoy dynamic module in Rust
  • Package it into a production-ready Docker image
  • Deploy it to Kubernetes with kgateway and agentgateway
  • Test the entire stack with a mock LLM endpoint

What you’ll need: Basic familiarity with Kubernetes, Docker, and command-line tools. No prior Rust experience required — I’ll explain the key parts as we go.

Time to complete: About 30-45 minutes.

Cost: Zero. Everything runs locally.

Architecture overview

Before diving into code, let’s understand what we’re building.

The lab routes a request through four layers:

  • A curl client sends a POST request
  • agentgateway-proxy (Envoy) receives it
  • A custom Rust module transforms the request
  • httpbun (a mock LLM) returns a fake response

curl → agentgateway-proxy → Rust Module (.so) → httpbun (mock LLM) → response

Here’s the complete architecture:

Everything runs locally on your laptop using kind (Kubernetes in Docker). No cloud costs. No API keys. The Rust module can be replaced with any transformation logic you need — the lab just shows the mechanism.

The stack

Here’s what each tool does:

| Tool | Purpose |
| --- | --- |
| kind | Creates a local Kubernetes cluster on your laptop |
| kgateway + agentgateway | Control plane that manages Envoy and handles Gateway API resources |
| Envoy | The proxy that sits between your client and backend, processing every request |
| Rust | Your custom transformation code, compiled into a shared library that Envoy loads at runtime |
| httpbun | A mock LLM that returns fake responses (no API key required) |

Everything is open source. Everything runs locally. You don’t need to spend a dime to follow along.

Before you start

Make sure you have these tools installed:

  • Docker (latest) – Runs containers, including your Kubernetes cluster and the Envoy proxy
  • kind (v0.20+) – Creates a local Kubernetes cluster
  • kubectl (v1.27+) – Talks to your Kubernetes cluster
  • Helm (v3.10+) – Installs kgateway and agentgateway packages
  • Rust (1.85+) – Builds the Rust module (optional; you can build inside Docker)

Create your cluster:

kind create cluster --name ai-gateway-lab

This command spins up a local Kubernetes cluster. All your gateway components will run inside it, isolated from your main system.

Part 1: The Rust module

The Rust code is split into two crates. Think of crates as folders that each contain a small library:

  • rustformations – The main Envoy filter that contains your transformation logic
  • transformations – A helper library that provides Jinja templating and shared transformation traits

Project structure

rust/
├── rustformations/
│   ├── Cargo.toml
│   └── src/
│       ├── lib.rs                    # Registers the filter with Envoy
│       └── http_simple_mutations.rs  # Your actual transformation logic
└── transformations/
    ├── Cargo.toml
    └── src/
        ├── lib.rs    # Defines transformation traits
        └── jinja.rs  # Jinja templating for dynamic transformations

The Cargo.toml file

Every Rust project has a Cargo.toml file. It lists dependencies and build instructions. Here’s what ours looks like:

[package]
name = "rustformations"
version = "0.1.0"
edition = "2021"

[dependencies]
# The Envoy SDK – tells Rust how to talk to Envoy's C ABI
envoy-proxy-dynamic-modules-rust-sdk = { path = "../patched-envoy-sdk/..." }
# Serialization – for parsing JSON requests and responses
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
# Templating – for dynamic prompt transformations
minijinja = { version = "2.12.0", features = ["loader"] }
# Our helper library
transformations = { path = "../transformations" }
# Error handling and shared state
anyhow = "1.0.100"
once_cell = "1.21.3"

[lib]
name = "rust_module"
path = "src/lib.rs"
crate-type = ["cdylib"] # Creates a .so file that Envoy can load

Key dependencies explained:

  • crate-type = ["cdylib"] – This is the most important line. It tells Rust to compile your code into a C-compatible shared library (.so file). Envoy can load this file at runtime without restarting.
  • envoy-proxy-dynamic-modules-rust-sdk – The official SDK that provides bindings between Rust and Envoy’s C API.
  • minijinja – A templating engine that lets you dynamically transform prompts using templates.
  • serde – Converts JSON requests into Rust structs and back.
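To make the cdylib point concrete, here is a minimal sketch of what "C-compatible" means on the Rust side: a function exported under an unmangled name with the C calling convention, taking a raw pointer and a length. The function name `demo_is_json_content_type` is hypothetical and exists only for illustration. The real entry points that Envoy looks up are provided by the SDK and are not shown here.

```rust
/// Hypothetical exported function, NOT part of the Envoy SDK.
/// `unsafe(no_mangle)` keeps the symbol name intact in the .so file so a
/// C caller can find it; `extern "C"` selects the C calling convention.
/// (On toolchains older than Rust 1.82 the attribute is written `#[no_mangle]`.)
#[unsafe(no_mangle)]
pub extern "C" fn demo_is_json_content_type(ptr: *const u8, len: usize) -> bool {
    // A C caller may hand us a null pointer; treat that as "no value".
    if ptr.is_null() {
        return false;
    }
    // SAFETY: the caller guarantees `ptr` points to `len` readable bytes
    // that stay valid for the duration of this call.
    let bytes = unsafe { std::slice::from_raw_parts(ptr, len) };
    // Media types are case-insensitive, so compare without regard to ASCII case.
    bytes.eq_ignore_ascii_case(b"application/json")
}
```

Because the function is also ordinary Rust, it can be called directly in a test with `value.as_ptr()` and `value.len()`. Raw pointers and lengths crossing the boundary are the reason the SDK and the Envoy version have to agree on the ABI.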

The transformation trait

A “trait” in Rust is like a contract. It says “If you want to be a transformation filter, you must implement these functions.”

// Assumed imports: anyhow and serde_json both appear in the Cargo.toml above.
use anyhow::Result;
use serde_json::Value as JsonValue;

pub trait TransformationOps {
    // Add a new header to the request (appends if header exists)
    fn add_request_header(&mut self, key: &str, value: &[u8]) -> bool;

    // Set a header (overwrites if it exists)
    fn set_request_header(&mut self, key: &str, value: &[u8]) -> bool;

    // Remove a header entirely
    fn remove_request_header(&mut self, key: &str) -> bool;

    // Same as above but for responses
    fn add_response_header(&mut self, key: &str, value: &[u8]) -> bool;
    fn set_response_header(&mut self, key: &str, value: &[u8]) -> bool;
    fn remove_response_header(&mut self, key: &str) -> bool;

    // Parse the request body as JSON so you can read and modify it
    fn parse_request_json_body(&mut self) -> Result<JsonValue>;

    // Get the raw request body as bytes
    fn get_request_body(&mut self) -> Vec<u8>;

    // ... more methods for streaming bodies, responses, etc.
}

What this means: When Envoy calls your Rust module, it gives you access to the request headers, request body, response headers, and response body at different points in the request lifecycle. You can read, modify, or replace anything you need.

A simple filter doesn’t need to call all of these methods. Start with the headers you want to change and grow from there.
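One benefit of putting these operations behind a trait is that transformation logic can be exercised without Envoy. The sketch below assumes a reduced, hypothetical trait (`RequestHeaderOps`, request headers only) and an in-memory `FakeRequest`. Neither name comes from the repo. They only illustrate the add-versus-set semantics described in the comments above.

```rust
/// Reduced, hypothetical version of the trait: request headers only.
pub trait RequestHeaderOps {
    fn add_request_header(&mut self, key: &str, value: &[u8]) -> bool;
    fn set_request_header(&mut self, key: &str, value: &[u8]) -> bool;
    fn remove_request_header(&mut self, key: &str) -> bool;
}

/// In-memory stand-in for the header map that Envoy would normally own.
#[derive(Default)]
pub struct FakeRequest {
    pub headers: Vec<(String, Vec<u8>)>,
}

impl RequestHeaderOps for FakeRequest {
    // Append: an existing header with the same name is kept.
    fn add_request_header(&mut self, key: &str, value: &[u8]) -> bool {
        self.headers.push((key.to_ascii_lowercase(), value.to_vec()));
        true
    }

    // Overwrite: drop every existing value first, then add the new one.
    fn set_request_header(&mut self, key: &str, value: &[u8]) -> bool {
        self.remove_request_header(key);
        self.add_request_header(key, value)
    }

    // Returns true only if at least one header was actually removed.
    fn remove_request_header(&mut self, key: &str) -> bool {
        let before = self.headers.len();
        let key = key.to_ascii_lowercase();
        self.headers.retain(|(k, _)| *k != key);
        self.headers.len() != before
    }
}

/// Example transformation written against the trait, not a concrete type:
/// strip a debug header and stamp the request with a tenant name.
pub fn tag_request<T: RequestHeaderOps>(req: &mut T, tenant: &str) -> bool {
    req.remove_request_header("x-internal-debug");
    req.set_request_header("x-tenant", tenant.as_bytes())
}
```

Calling `tag_request(&mut FakeRequest::default(), "acme")` leaves exactly one `x-tenant` header, however many were added beforehand. The same function would work unchanged against an Envoy-backed implementation of the trait.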

Part 2: The Docker image

We need to package Envoy with our Rust module into a single Docker image. This Dockerfile uses a multi-stage build to keep the final image small.

# Stage 1: Build the Rust module
FROM rust:1.85 AS builder
WORKDIR /build

# Install clang – needed to compile C bindings for the Envoy SDK
RUN apt-get update && apt-get install -y clang

# Copy all Rust source code into the container
COPY rustformations/ ./rustformations/
COPY transformations/ ./transformations/
COPY patched-envoy-sdk/ ./patched-envoy-sdk/

# Build the Rust module in release mode (optimized, no debug symbols)
WORKDIR /build/rustformations
RUN cargo build --release

# Stage 2: Final Envoy image
FROM envoyproxy/envoy:v1.36.4

# Install CA certificates – Envoy needs these to validate HTTPS backends
RUN apt-get update && apt-get install -y ca-certificates && rm -rf /var/lib/apt/lists/*

# Copy the envoyinit wrapper binary (handles Envoy startup)
COPY envoyinit-linux-amd64 /usr/local/bin/envoyinit
RUN chmod +x /usr/local/bin/envoyinit

# Copy the compiled Rust module from the builder stage
COPY --from=builder /build/rustformations/target/release/librust_module.so /usr/local/lib/

# Copy the entrypoint script (decides how to start Envoy)
COPY docker-entrypoint.sh /
RUN chmod +x /docker-entrypoint.sh

# Tell Envoy where to find dynamic modules
ENV ENVOY_DYNAMIC_MODULES_SEARCH_PATH=/usr/local/lib

# Run as non-root for security
USER 10101
ENTRYPOINT ["/docker-entrypoint.sh"]

What each stage does:

  • Stage 1 (builder) – Compiles your Rust code. Uses a larger image with Rust and build tools. Creates the .so file.
  • Stage 2 (runtime) – Only contains Envoy and your compiled .so file. Keeps the final image small (319MB).

Build the image:

docker build -f Dockerfile.rust85 -t envoy-wrapper:test .

This creates a Docker image named envoy-wrapper:test that contains Envoy plus your custom Rust module. You can run this image anywhere Docker runs.
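Three names in this build have to line up: the `[lib] name` in Cargo.toml (`rust_module`), the file the Dockerfile copies (`librust_module.so`), and the search path set in `ENVOY_DYNAMIC_MODULES_SEARCH_PATH`. As I understand Envoy's dynamic module loading, a module referenced by name is looked up as `lib<name>.so` inside that directory. The helper below is a hypothetical sketch of that naming rule, not Envoy code.

```rust
use std::path::{Path, PathBuf};

/// Sketch of the lookup rule (assumption based on Envoy's documentation):
/// <search_path>/lib<module_name>.so
pub fn resolve_module_path(search_path: &str, module_name: &str) -> PathBuf {
    Path::new(search_path).join(format!("lib{module_name}.so"))
}
```

With the values from this lab, `resolve_module_path("/usr/local/lib", "rust_module")` points at the exact file the Dockerfile copies out of the builder stage. If you rename the library in Cargo.toml, the COPY line and the module name in the Envoy config need to change with it.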

Part 3: Deploying to Kubernetes

Now we deploy everything to your local Kubernetes cluster.

1. Install Gateway API CRDs

kubectl apply -f

What this does: Installs the Custom Resource Definitions (CRDs) for Gateway API. These let you define Gateways, HTTPRoutes, and other routing resources in Kubernetes.

2. Install kgateway (Control Plane)

helm upgrade -i kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds \
  --create-namespace --namespace kgateway-system \
  --version v2.2.1

helm upgrade -i kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway \
  --namespace kgateway-system \
  --version v2.2.1

What this does: Installs kgateway, the control plane, into your cluster. It runs in the kgateway-system namespace and manages Envoy instances.

3. Install agentgateway (AI Data Plane)

helm upgrade -i agentgateway-crds oci://cr.agentgateway.dev/charts/agentgateway-crds \
  --create-namespace --namespace agentgateway-system \
  --version v1.1.0

helm upgrade -i agentgateway oci://cr.agentgateway.dev/charts/agentgateway \
  --namespace agentgateway-system \
  --version v1.1.0

What this does: Installs agentgateway, the AI-focused data plane that works alongside kgateway. This component actually handles AI traffic.

4. Deploy httpbun (the Mock LLM)

kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbun
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpbun
  template:
    metadata:
      labels:
        app: httpbun
    spec:
      containers:
        - name: httpbun
          image: sharat87/httpbun
          env:
            - name: HTTPBUN_BIND
              value: "0.0.0.0:3090"
          ports:
            - containerPort: 3090
---
apiVersion: v1
kind: Service
metadata:
  name: httpbun
  namespace: default
spec:
  selector:
    app: httpbun
  ports:
    - protocol: TCP
      port: 3090
      targetPort: 3090
EOF

What this does: Deploys httpbun – a fake OpenAI-compatible LLM. It listens on port 3090 and returns mock responses. No API key needed.

5. Create the AgentgatewayBackend

kubectl apply -f - <<EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: httpbun-llm
  namespace: agentgateway-system
spec:
  ai:
    provider:
      openai:
        model: gpt-4
      host: httpbun.default.svc.cluster.local
      port: 3090
      path: "/llm/chat/completions"
EOF

What this does: Tells agentgateway that there’s an LLM backend at that address speaking the OpenAI API format.

6. Create the Gateway and HTTPRoute

kubectl apply -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: agentgateway-proxy
  namespace: agentgateway-system
spec:
  gatewayClassName: agentgateway
  listeners:
    - protocol: HTTP
      port: 80
      name: http
      allowedRoutes:
        namespaces:
          from: All
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: httpbun-llm
  namespace: agentgateway-system
spec:
  parentRefs:
    - name: agentgateway-proxy
      namespace: agentgateway-system
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /v1/chat/completions
      backendRefs:
        - name: httpbun-llm
          namespace: agentgateway-system
          group: agentgateway.dev
          kind: AgentgatewayBackend
EOF

What this does:

  • Gateway – Creates an entry point for traffic. Listens on port 80.
  • HTTPRoute – Routes requests matching /v1/chat/completions to the httpbun backend.
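The route uses `type: PathPrefix`, which Gateway API defines as matching path element by element rather than as a raw string prefix. The function below is a simplified sketch of that rule. The name `path_prefix_matches` is hypothetical, and it takes two shortcuts: it treats repeated slashes as one and simply drops any query string.

```rust
/// Simplified sketch of Gateway API PathPrefix semantics:
/// every element of `prefix` must equal the element at the same
/// position in the request path.
pub fn path_prefix_matches(prefix: &str, request_path: &str) -> bool {
    // Ignore the query string, if any.
    let path = request_path.split('?').next().unwrap_or("");

    // Split both sides into non-empty path elements.
    let want: Vec<&str> = prefix.split('/').filter(|s| !s.is_empty()).collect();
    let got: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();

    // The request needs at least as many elements as the prefix,
    // and each prefix element must match exactly (case-sensitive).
    got.len() >= want.len() && want.iter().zip(&got).all(|(w, g)| w == g)
}
```

Under this rule `/v1/chat/completions` and `/v1/chat/completions/stream` both reach the backend, while `/v1/chat/completionsX` does not, even though it shares the same leading characters.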

Part 4: Testing it all works

1. Port-forward the gateway

kubectl port-forward -n agentgateway-system svc/agentgateway-proxy 8082:80

What this does: Forwards traffic from your laptop’s port 8082 to the gateway pod running in Kubernetes. This lets you test locally as if you were outside the cluster.

2. Send a test request

Open a new terminal and run:

curl -X POST http://localhost:8082/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4","messages":[{"role":"user","content":"Hello"}]}'

3. Expected response

{
  "choices": [{
    "message": {
      "content": "This is a mock chat response from httpbun."
    }
  }]
}

If you see this, everything works:

  • The Rust module loaded successfully
  • The gateway routed the request correctly
  • The mock LLM responded

Troubleshooting common issues

Problem 1: Rust version mismatch

Error:

error: feature `edition2024` is required

Cause: Some Rust crates require newer compiler features. Your Rust version is too old.

Fix: Upgrade Rust in your Dockerfile from 1.75 to 1.85 or newer.

Problem 2: Missing ABI Symbol

Error:

undefined symbol: envoy_dynamic_module_callback_http_add_response_header

Cause: Your SDK doesn’t match your Envoy version. Envoy v1.36.4 expects certain functions that older SDKs don’t provide.

Fix: Copy the official SDK directly from the Envoy source:

cp -r envoy/source/extensions/dynamic_modules/sdk/rust patched-envoy-sdk/

Problem 3: filter_config format

Error:

text

error parsing filter config: EOF while parsing a value

Cause: Envoy expects the filter configuration to be wrapped in a protobuf Any type. Without the wrapper, your Rust code receives an empty config string, and the JSON parser rejects it with this EOF error.

Fix: Use the protobuf wrapper in your Envoy config:

filter_config:
  "@type": type.googleapis.com/google.protobuf.StringValue
  value: "{}"
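"EOF while parsing a value" is the message serde_json produces for empty input, so a module can also defend itself against a missing config. The sketch below is one possible guard, not code from the repo. The function name `effective_filter_config` is hypothetical, and it assumes the module receives its configuration as raw bytes before handing them to a JSON parser.

```rust
/// Hypothetical guard: turn an empty or whitespace-only config into "{}"
/// so the JSON parser downstream always sees a valid document.
/// Returns an error if the bytes are not valid UTF-8.
pub fn effective_filter_config(raw: &[u8]) -> Result<&str, std::str::Utf8Error> {
    let text = std::str::from_utf8(raw)?.trim();
    Ok(if text.is_empty() { "{}" } else { text })
}
```

This does not replace the wrapper fix above. It only turns a confusing parse error into a sensible default when the configuration is left out entirely.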

Next steps: Production and real LLMs

This lab uses httpbun as a mock. To use a real LLM:

  • Get an API key from OpenAI, Anthropic, or Gemini
  • Create a Kubernetes secret with your key
  • Update the AgentgatewayBackend to use the real host and authentication

apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: openai
  namespace: agentgateway-system
spec:
  ai:
    provider:
      openai:
        model: gpt-4
      host: api.openai.com
      port: 443
  policies:
    auth:
      secretRef:
        name: openai-secret

For production, also add:

  • Authentication (API keys, JWT, or mTLS)
  • Rate limiting to control costs
  • Observability (metrics, logs, tracing)
  • A real Kubernetes cluster (EKS, GKE, or AKS) instead of kind

agentgateway supports all of these through its policy CRDs.

Complete code

Everything is on GitHub: github.com/Mike-4-prog/ai-gateway-lab

The repo includes:

  • All Kubernetes manifests
  • Complete Rust source code
  • Multi-stage Dockerfile
  • Quick start README

You can clone it and run the entire lab in about 10 minutes.

Final thoughts

Building this lab taught me three things:

  • Extending agentgateway with Rust is powerful but strict. The SDK must match Envoy exactly. The Rust version must support your dependencies. One version mismatch and everything breaks.
  • The filter_config format is not obvious. The protobuf wrapper is documented, but easy to miss. I spent hours on this error before finding the solution in the docs.
  • Starting with a mock LLM saves time and money. httpbun let me focus on the gateway, not the AI provider. I could test everything locally without worrying about API keys or costs.

If you’re building on agentgateway and need a capability that doesn’t exist yet, you now know how to build it yourself.

Questions? Find me on GitHub.

Special thanks to Art Berger and the kgateway team for their guidance and encouragement.
