aibrix
[Discussion] Simplify AIBrix deployment by removing Envoy Gateway
🚀 Feature Description and Motivation
Today AIBrix recommends/ships Envoy Gateway, but we typically run a single gateway instance. Envoy Gateway adds installation and controller complexity that many users don’t need. This issue proposes an option to deploy vanilla Envoy + ext-proc (External Processing), fully managed by AIBrix (no Envoy Gateway), to reduce footprint, speed up onboarding, and enable non-Kubernetes environments.
- We deploy only one gateway instance in most setups.
- Envoy Gateway brings CRDs, controllers, and a multi-component lifecycle that:
  - increases install time and RBAC surface,
  - complicates upgrades and troubleshooting,
  - couples us to Kubernetes even when users want a lighter path.
- Users without K8s controllers (or not on K8s at all) still want AIBrix features (routing, rate limiting by tokens/sec, prefix-cache awareness, batch API paths, etc.).
Use Case
- Users who do not run Kubernetes.
- Users who want a simpler deployment.
Proposed Solution
- Envoy runs as a single process with static or minimally-templated config (no Gateway API, no controllers).
- Generate Envoy bootstrap/listener/cluster config from AIBrix control-plane templates (Helm/JSON/YAML or aibrixctl).
- Hot reload via SDS/xDS is optional (the Envoy bootstrap points at our control plane for dynamic clusters if needed).
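
To make the proposal concrete, here is a minimal sketch of what "generate Envoy config from a template" could look like. It is not AIBrix's actual generator: the function name `build_envoy_bootstrap`, the cluster names `inference_backends` and `aibrix_ext_proc`, the ports, and the processing mode are all assumptions chosen for illustration. It emits a static Envoy v3 bootstrap as JSON with one listener, an `ext_proc` HTTP filter placed ahead of the router, and two static clusters.

```python
import json


def _cluster(name, endpoints, http2=False):
    """Build a static STRICT_DNS cluster from (host, port) pairs."""
    cluster = {
        "name": name,
        "type": "STRICT_DNS",
        "connect_timeout": "1s",
        "lb_policy": "ROUND_ROBIN",
        "load_assignment": {
            "cluster_name": name,
            "endpoints": [{
                "lb_endpoints": [
                    {"endpoint": {"address": {"socket_address": {
                        "address": host, "port_value": port}}}}
                    for host, port in endpoints
                ]
            }],
        },
    }
    if http2:
        # ext_proc speaks gRPC, so its upstream cluster must use HTTP/2.
        opts = "envoy.extensions.upstreams.http.v3.HttpProtocolOptions"
        cluster["typed_extension_protocol_options"] = {opts: {
            "@type": "type.googleapis.com/" + opts,
            "explicit_http_config": {"http2_protocol_options": {}},
        }}
    return cluster


def build_envoy_bootstrap(listen_port, backends, ext_proc_host, ext_proc_port):
    """Hypothetical template: one listener, ext_proc filter, static clusters."""
    http_filters = [
        {   # External processor sees each request before it is routed.
            "name": "envoy.filters.http.ext_proc",
            "typed_config": {
                "@type": "type.googleapis.com/envoy.extensions.filters.http"
                         ".ext_proc.v3.ExternalProcessor",
                "grpc_service": {"envoy_grpc": {"cluster_name": "aibrix_ext_proc"}},
                # Assumed mode: send request headers and a buffered body.
                "processing_mode": {"request_header_mode": "SEND",
                                    "request_body_mode": "BUFFERED"},
            },
        },
        {   # The router filter must be last in the chain.
            "name": "envoy.filters.http.router",
            "typed_config": {
                "@type": "type.googleapis.com/envoy.extensions.filters.http"
                         ".router.v3.Router"},
        },
    ]
    hcm = {
        "name": "envoy.filters.network.http_connection_manager",
        "typed_config": {
            "@type": "type.googleapis.com/envoy.extensions.filters.network"
                     ".http_connection_manager.v3.HttpConnectionManager",
            "stat_prefix": "ingress_http",
            "route_config": {"virtual_hosts": [{
                "name": "default",
                "domains": ["*"],
                "routes": [{"match": {"prefix": "/"},
                            "route": {"cluster": "inference_backends"}}],
            }]},
            "http_filters": http_filters,
        },
    }
    listener = {
        "name": "ingress",
        "address": {"socket_address": {"address": "0.0.0.0",
                                       "port_value": listen_port}},
        "filter_chains": [{"filters": [hcm]}],
    }
    return {"static_resources": {
        "listeners": [listener],
        "clusters": [
            _cluster("inference_backends", backends),
            _cluster("aibrix_ext_proc", [(ext_proc_host, ext_proc_port)], http2=True),
        ],
    }}


if __name__ == "__main__":
    config = build_envoy_bootstrap(8080, [("127.0.0.1", 8000)], "127.0.0.1", 50052)
    print(json.dumps(config, indent=2))
```

Writing the output to a file and starting Envoy with `envoy -c bootstrap.json` would run the gateway as a single process with no Gateway API resources or controllers. A static file like this needs an Envoy restart (or hot restart) to pick up changes; the optional xDS path above would replace the static `clusters` list with dynamic resources served by the control plane.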
Do we need to refer to or research other products such as Dynamo, LLM-D, and other cloud-native inference services? 🤔
@googs1025 We can check other solutions as references.