
Support per user api-key for multi-tenant use case

Open Jeffwan opened this issue 10 months ago • 3 comments

🚀 Feature Description and Motivation

Background

Currently, vLLM only supports a single API key for authentication, making it difficult to share the inference engine across multiple tenants. Extending vLLM to support multiple keys is an option, but this would be a static solution. A more flexible approach is needed to handle multi-tenant API key management dynamically.

Proposed Solutions

Option 1: Extend vLLM to Support External Authentication

  • vLLM integrates with an external authentication server to validate API keys dynamically.
  • This approach is more flexible but introduces external dependencies. The overhead of the extra validation step is another concern.
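One way to limit the overhead mentioned for Option 1 is to cache validation results for a short time, so that most requests never reach the external server. The following is a minimal sketch under stated assumptions, not vLLM code: `CachedKeyValidator`, `validate_remote`, and the TTL value are hypothetical names chosen for illustration, and `validate_remote` stands in for whatever call the external authentication server would expose.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple


class CachedKeyValidator:
    """Hypothetical wrapper that caches external validation results.

    `validate_remote` is an assumed callable that asks the external
    authentication server about a key and returns a tenant ID, or None
    if the key is not valid.
    """

    def __init__(
        self,
        validate_remote: Callable[[str], Optional[str]],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._validate_remote = validate_remote
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps a SHA-256 digest of the key to (tenant ID or None, expiry time).
        self._cache: Dict[str, Tuple[Optional[str], float]] = {}

    def tenant_for(self, api_key: str) -> Optional[str]:
        digest = hashlib.sha256(api_key.encode("utf-8")).hexdigest()
        now = self._clock()
        hit = self._cache.get(digest)
        if hit is not None and hit[1] > now:
            return hit[0]  # Fresh cache entry: no remote call needed.
        tenant = self._validate_remote(api_key)
        # Negative results are cached too, so invalid keys do not
        # trigger a remote call on every request.
        self._cache[digest] = (tenant, now + self._ttl)
        return tenant
```

The trade-off is that a revoked key may remain usable until its cache entry expires, so the TTL bounds how quickly revocation takes effect.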

Option 2: Manage API Keys Outside of vLLM

Option 2a: User-Managed Authentication (Bring Your Own Stack)

  • Users adopt an external authentication solution (e.g., Istio, OAuth, or API gateways) to manage API keys.

Option 2b: Extend AIBrix Gateway for Multi-Tenant API Key Management

  • AIBrix Gateway already has a basic user concept and rate-limiting control.
  • The extension would associate users with API keys, providing built-in multi-tenancy support.
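To make Option 2b more concrete, here is a rough sketch of what associating users with API keys could look like. It is written in Python for illustration only and is not the AIBrix Gateway implementation; `User`, `ApiKeyStore`, the `sk-` prefix, and the `requests_per_minute` field are all hypothetical. It assumes clients send the key as `Authorization: Bearer <key>`, and it stores only a digest of each key rather than the key itself.

```python
import hashlib
import secrets
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class User:
    """Hypothetical user record; the rate limit field is illustrative."""
    name: str
    requests_per_minute: int


def _digest(api_key: str) -> str:
    # Store a digest so that a leaked table does not expose usable keys.
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


class ApiKeyStore:
    """In-memory mapping from API key digests to users."""

    def __init__(self) -> None:
        self._users_by_digest: Dict[str, User] = {}

    def issue_key(self, user: User) -> str:
        """Create a new random key for `user` and return it once."""
        api_key = "sk-" + secrets.token_urlsafe(32)
        self._users_by_digest[_digest(api_key)] = user
        return api_key

    def revoke_key(self, api_key: str) -> bool:
        """Remove a key; returns True if it existed."""
        return self._users_by_digest.pop(_digest(api_key), None) is not None

    def authenticate(self, authorization_header: Optional[str]) -> Optional[User]:
        """Resolve an `Authorization: Bearer <key>` header to a user."""
        if not authorization_header:
            return None
        scheme, _, token = authorization_header.partition(" ")
        if scheme.lower() != "bearer" or not token:
            return None
        return self._users_by_digest.get(_digest(token.strip()))
```

Because keys can be issued and revoked at runtime, this avoids the static configuration problem described in the Background section. A real deployment would need persistent storage instead of an in-memory dictionary.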

Future Considerations

In addition to authentication, we want to support tenant-aware optimizations within vLLM. The gateway should attach tenant metadata (e.g., X-Tenant-ID, X-Priority, JWT claims) before forwarding the request to vLLM. This would enable the inference engine to make tenant-aware optimizations, such as priority-based scheduling or resource allocation.
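As an illustration of the metadata step, the sketch below shows how a gateway might attach tenant headers before forwarding. It is an assumption-laden example rather than an agreed design: `attach_tenant_headers` is a hypothetical helper, and the header names are taken from the examples above. The one design point it tries to show is that any client-supplied copies of these headers are dropped first, so a tenant cannot claim another tenant's identity or priority.

```python
from typing import Dict, Mapping

# Headers that only the gateway is allowed to set (lowercase for comparison).
GATEWAY_OWNED_HEADERS = ("x-tenant-id", "x-priority")


def attach_tenant_headers(
    incoming: Mapping[str, str], tenant_id: str, priority: int
) -> Dict[str, str]:
    """Return the headers to forward upstream for an authenticated tenant."""
    # Drop any client-supplied values for gateway-owned headers.
    forwarded = {
        name: value
        for name, value in incoming.items()
        if name.lower() not in GATEWAY_OWNED_HEADERS
    }
    # Attach the values the gateway derived from the authenticated key.
    forwarded["X-Tenant-ID"] = tenant_id
    forwarded["X-Priority"] = str(priority)
    return forwarded
```

How vLLM would consume these headers (for example, for priority-based scheduling) is outside the scope of this sketch.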

Open Questions

  • Which approach aligns best with the vLLM architecture?
  • Should vLLM natively support dynamic authentication, or should this be handled externally?
  • How can we ensure a smooth integration between vLLM and the authentication layer without introducing significant overhead?

/cc @simon-mo @robertgshaw2-redhat @gaocegege @kerthcet

Use Case

Support multi-tenancy for vLLM

Proposed Solution

No response

Jeffwan avatar Feb 26 '25 23:02 Jeffwan

One use-case I'd love to see supported as a tenant-aware optimization is tenant-based LoRA adapters.

jolfr avatar Mar 01 '25 02:03 jolfr

@Jeffwan - aren't you able to support multi-key auth by rolling your own nginx in front of this? Also - do you have this working well with Istio? I am trying to investigate things now and I'm worried about the introduction of another Gateway resource...

ericmeadows avatar Jun 29 '25 19:06 ericmeadows

@ericmeadows

aren't you able to support multi-key auth by rolling your own nginx in front of this?

Technically, yes. Keys can be managed entirely outside the scope of AIBrix, and users can build this layer on their own. Since vLLM (server side) allows setting an API key, I am wondering whether we can integrate better here, or simply remove authentication at the vLLM layer and rely entirely on the upstream system to provide the protection.

do you have this working well with Istio?

We have not integrated with Istio yet, but I think Istio authN could play a similar role.

Feel free to share your thoughts and we can discuss further. For now, this should be a sub-task of https://github.com/vllm-project/aibrix/issues/1101

Jeffwan avatar Jun 30 '25 02:06 Jeffwan