Support per-user API keys for multi-tenant use cases
🚀 Feature Description and Motivation
Background
Currently, vLLM only supports a single API key for authentication, making it difficult to share the inference engine across multiple tenants. Extending vLLM to support multiple keys is an option, but this would be a static solution. A more flexible approach is needed to handle multi-tenant API key management dynamically.
Proposed Solutions
Option 1: Extend vLLM to Support External Authentication
- vLLM integrates with an external authentication server to validate API keys dynamically.
- This approach is more flexible but introduces external dependencies; the added overhead is another concern.
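To make the overhead concern in Option 1 concrete, here is a minimal sketch of how validation results could be cached so that the external authentication server is not called on every request. Everything in it is an assumption for illustration: `CachedKeyValidator`, the `lookup` callable (standing in for the call to the external server), and the TTL value are hypothetical and are not part of vLLM.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CachedKeyValidator:
    """Hypothetical helper: caches API-key lookups for a short TTL."""

    def __init__(
        self,
        lookup: Callable[[str], Optional[str]],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # `lookup` stands in for the external auth server call; it returns
        # a tenant id for a valid key and None for an unknown key.
        self._lookup = lookup
        self._ttl = ttl_seconds
        self._clock = clock
        # api_key -> (tenant id or None, expiry timestamp)
        self._cache: Dict[str, Tuple[Optional[str], float]] = {}

    def tenant_for(self, api_key: str) -> Optional[str]:
        now = self._clock()
        hit = self._cache.get(api_key)
        if hit is not None and hit[1] > now:
            return hit[0]  # still fresh, skip the external call
        tenant = self._lookup(api_key)
        # Negative results are cached too, so invalid keys do not
        # trigger an external call on every request.
        self._cache[api_key] = (tenant, now + self._ttl)
        return tenant
```

The trade-off is that a revoked key stays valid until its cache entry expires, so the TTL bounds how "dynamic" the key management really is.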
Option 2: Manage API Keys Outside of vLLM
Option 2a: User-Managed Authentication (Bring Your Own Stack)
- Users adopt an external authentication solution (e.g., Istio, OAuth, or API gateways) to manage API keys.
Option 2b: Extend AIBrix Gateway for Multi-Tenant API Key Management
- AIBrix Gateway already has a basic user concept and rate-limiting control.
- The extension would associate users with API keys, providing built-in multi-tenancy support.
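As a rough illustration of Option 2b, the sketch below shows one way a gateway could associate users with API keys. It assumes keys arrive in an OpenAI-style `Authorization: Bearer <key>` header and that only key digests are stored. `ApiKeyRegistry` and its methods are hypothetical names, not an existing AIBrix Gateway API.

```python
import hashlib
import secrets
from typing import Dict, Optional


class ApiKeyRegistry:
    """Hypothetical in-memory mapping from API-key digests to users."""

    def __init__(self) -> None:
        # Only SHA-256 digests are stored, never the raw keys.
        self._user_by_digest: Dict[str, str] = {}

    @staticmethod
    def _digest(api_key: str) -> str:
        return hashlib.sha256(api_key.encode("utf-8")).hexdigest()

    def issue_key(self, user: str) -> str:
        """Create a new random key for `user` and return it once."""
        api_key = "sk-" + secrets.token_urlsafe(32)
        self._user_by_digest[self._digest(api_key)] = user
        return api_key

    def revoke_key(self, api_key: str) -> bool:
        """Remove a key; returns True if it existed."""
        return self._user_by_digest.pop(self._digest(api_key), None) is not None

    def authenticate(self, authorization_header: str) -> Optional[str]:
        """Return the user for a 'Bearer <key>' header, or None."""
        scheme, _, token = authorization_header.partition(" ")
        if scheme.lower() != "bearer" or not token:
            return None
        return self._user_by_digest.get(self._digest(token.strip()))
```

Because keys can be issued and revoked at runtime, this avoids the static-configuration limitation described in the background; a real deployment would need persistent storage rather than a process-local dictionary.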
Future Considerations
In addition to authentication, we want to support tenant-aware optimizations within vLLM. The gateway should attach tenant metadata (e.g., X-Tenant-ID, X-Priority, JWT claims) before forwarding the request to vLLM. This would enable the inference engine to make tenant-aware optimizations, such as priority-based scheduling or resource allocation.
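The flow described above could look roughly like the following sketch. It makes two assumptions that the proposal does not specify: the gateway discards any client-supplied `X-Tenant-ID` or `X-Priority` values before setting its own, and a larger `X-Priority` number means higher priority. `attach_tenant_headers` and `PriorityRequestQueue` are hypothetical names and do not correspond to existing vLLM or AIBrix code.

```python
import heapq
import itertools
from typing import Dict, List, Tuple

# Headers that only the gateway is allowed to set (compared in lowercase).
TRUSTED_HEADERS = ("x-tenant-id", "x-priority")


def attach_tenant_headers(
    headers: Dict[str, str], tenant_id: str, priority: int
) -> Dict[str, str]:
    """Gateway side: drop client-supplied tenant headers, then set trusted ones."""
    forwarded = {
        name: value
        for name, value in headers.items()
        if name.lower() not in TRUSTED_HEADERS
    }
    forwarded["X-Tenant-ID"] = tenant_id
    forwarded["X-Priority"] = str(priority)
    return forwarded


class PriorityRequestQueue:
    """Engine side: serve higher X-Priority first, FIFO within a priority."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._counter = itertools.count()  # tie-breaker that preserves arrival order

    def push(self, headers: Dict[str, str], request_id: str) -> None:
        # Assumption: requests without the header get priority 0.
        priority = int(headers.get("X-Priority", "0"))
        # heapq is a min-heap, so negate the priority to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._counter), request_id))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]
```

Stripping the incoming headers matters because the engine would otherwise trust values a client could set for itself.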
Open Questions
- Which approach aligns best with the vLLM architecture?
- Should vLLM natively support dynamic authentication, or should this be handled externally?
- How can we ensure a smooth integration between vLLM and the authentication layer without introducing significant overhead?
/cc @simon-mo @robertgshaw2-redhat @gaocegege @kerthcet
Use Case
Support multi-tenancy for vLLM
Proposed Solution
No response
One use-case I'd love to see supported as a tenant-aware optimization is tenant-based LoRA adapters.
@Jeffwan - aren't you able to support multi-key auth by rolling your own nginx in front of this? Also - do you have this working well with Istio? I am trying to investigate things now and I'm worried about the introduction of another Gateway resource...
@ericmeadows
aren't you able to support multi-key auth by rolling your own nginx in front of this?
Technically, yes. Keys can be managed entirely outside of AIBrix, and users can build this layer on their own. Since vLLM (server side) already supports setting a key, I am wondering whether we can integrate better here, or simply remove authentication at the vLLM layer and rely entirely on the upstream system to provide the protection.
do you have this working well with Istio?
We have not integrated with Istio yet, and I think Istio authN could play a similar role.
Feel free to share your thoughts and we can discuss further. This should now be a sub-task of https://github.com/vllm-project/aibrix/issues/1101