
[Discussion] Simplify AIBrix deployment by removing Envoy Gateway

Open Jeffwan opened this issue 2 months ago • 2 comments

🚀 Feature Description and Motivation

Today AIBrix recommends/ships Envoy Gateway, but we typically run a single gateway instance. Envoy Gateway adds installation and controller complexity that many users don’t need. I propose an option to deploy vanilla Envoy + ext-proc (External Processing), fully managed by AIBrix (no Envoy Gateway), to reduce footprint, speed up onboarding, and support non-Kubernetes environments.

  • We deploy only one gateway instance in most setups.
  • Envoy Gateway brings CRDs, controllers, and a multi-component lifecycle that:
    • increases install time and RBAC surface,
    • complicates upgrades and troubleshooting,
    • couples us to Kubernetes even when users want a lighter path.
  • Users without K8s controllers (or not on K8s at all) still want AIBrix features (routing, rate limiting by tokens/sec, prefix-cache awareness, batch API paths, etc.).

Use Case

  • For non-Kubernetes users.
  • For users who need a simpler deployment.

Proposed Solution

  • Envoy runs as a single process with static or minimally-templated config (no Gateway API, no controllers).
  • Generate Envoy bootstrap/listener/cluster config from AIBrix control-plane templates (Helm/JSON/YAML or aibrixctl).
  • Optional hot reload via SDS/xDS (the Envoy bootstrap points at our control plane for dynamic clusters if needed).
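
As a rough illustration of the template-generation idea, here is a minimal Python sketch that renders a static Envoy bootstrap with one listener, an ext-proc HTTP filter, and two clusters. It is not existing AIBrix code: the function names (`render_bootstrap`, `_cluster`), the cluster names, the addresses and ports, and the `processing_mode` choice are all placeholders assumed for the example. Only the Envoy v3 field names and type URLs come from Envoy's own configuration schema.

```python
import json

# Envoy v3 type URLs for the typed_config blocks used below.
HCM = ("type.googleapis.com/envoy.extensions.filters.network."
       "http_connection_manager.v3.HttpConnectionManager")
EXT_PROC = ("type.googleapis.com/envoy.extensions.filters.http."
            "ext_proc.v3.ExternalProcessor")
ROUTER = "type.googleapis.com/envoy.extensions.filters.http.router.v3.Router"
HTTP_OPTS = ("type.googleapis.com/envoy.extensions.upstreams.http.v3."
             "HttpProtocolOptions")


def _cluster(name, host, port, http2=False):
    """Build one static cluster with a single endpoint (hypothetical helper)."""
    cluster = {
        "name": name,
        "type": "STRICT_DNS",
        "connect_timeout": "1s",
        "load_assignment": {
            "cluster_name": name,
            "endpoints": [{"lb_endpoints": [{"endpoint": {"address": {
                "socket_address": {"address": host, "port_value": port},
            }}}]}],
        },
    }
    if http2:
        # ext-proc is a gRPC service, so its cluster must speak HTTP/2.
        cluster["typed_extension_protocol_options"] = {
            "envoy.extensions.upstreams.http.v3.HttpProtocolOptions": {
                "@type": HTTP_OPTS,
                "explicit_http_config": {"http2_protocol_options": {}},
            }
        }
    return cluster


def render_bootstrap(listen_port, ext_proc, backend):
    """Render a static bootstrap dict; ext_proc and backend are (host, port)."""
    http_filters = [
        {"name": "envoy.filters.http.ext_proc", "typed_config": {
            "@type": EXT_PROC,
            "grpc_service": {"envoy_grpc": {"cluster_name": "ext_proc"}},
            # Assumption: the processor wants request headers and full body.
            "processing_mode": {"request_header_mode": "SEND",
                                "request_body_mode": "BUFFERED"},
        }},
        # The router filter must be last in the chain.
        {"name": "envoy.filters.http.router",
         "typed_config": {"@type": ROUTER}},
    ]
    hcm = {
        "@type": HCM,
        "stat_prefix": "ingress_http",
        "route_config": {"name": "local_route", "virtual_hosts": [{
            "name": "default", "domains": ["*"],
            "routes": [{"match": {"prefix": "/"},
                        "route": {"cluster": "backend"}}],
        }]},
        "http_filters": http_filters,
    }
    listener = {
        "name": "ingress",
        "address": {"socket_address": {"address": "0.0.0.0",
                                       "port_value": listen_port}},
        "filter_chains": [{"filters": [{
            "name": "envoy.filters.network.http_connection_manager",
            "typed_config": hcm,
        }]}],
    }
    return {"static_resources": {
        "listeners": [listener],
        "clusters": [_cluster("ext_proc", *ext_proc, http2=True),
                     _cluster("backend", *backend)],
    }}


if __name__ == "__main__":
    # Placeholder addresses; write the output to a file and run `envoy -c`.
    cfg = render_bootstrap(8080, ("127.0.0.1", 50052), ("127.0.0.1", 8000))
    print(json.dumps(cfg, indent=2))
```

Envoy accepts a JSON bootstrap directly, so the output could be written to a file such as `bootstrap.json` and passed with `envoy -c bootstrap.json`. A real template would also need per-model routes and whatever dynamic (xDS) resources the control plane serves.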

Jeffwan • Oct 11 '25 18:10

Should we look at how other products, such as Dynamo, llm-d, and other cloud-native inference services, handle this? 🤔

googs1025 • Oct 12 '25 02:10

@googs1025 We can check other solutions as references.

Jeffwan • Oct 13 '25 03:10