
Documentation does not clearly explain how to set rate limiting, how to measure token consumption, or how to enable authentication for different users

Open vivekrsintc opened this issue 9 months ago • 5 comments

🐛 Describe the bug

https://aibrix.readthedocs.io/latest/features/gateway-plugins.html#
The above documentation only describes the features, not how to use them.

Steps to Reproduce

https://aibrix.readthedocs.io/latest/features/gateway-plugins.html#

Expected behavior

Proper documentation on how to enable rate limiting and how to control model access, with the endpoints used to set these features clearly called out.

Environment

All AIbrix

vivekrsintc avatar Mar 13 '25 09:03 vivekrsintc

To unblock you, I am adding the details here; I will add them to the documentation as well.

  • We have a separate metadata service, so it needs its own port forwarding. Work is in progress to move it under the gateway umbrella, which will remove the need for separate port forwarding.

kubectl -n aibrix-system port-forward svc/aibrix-metadata-service 8090:8090 &

  • Create a user and specify its RPM (requests per minute) and TPM (tokens per minute) limits. For more details, please check out pkg/metadata/README.md

curl http://localhost:8090/CreateUser \
  -H "Content-Type: application/json" \
  -d '{"name": "your-user-name","rpm": 100,"tpm": 1000}'
  • Send an inference request: same as in the quick start, except that you add a header for the user.
curl -v http://localhost:8888/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer any_key" \
  -H "user: your-user-name" \
  -d '{
     "model": "llama2-7b",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'
  • For user config, we are also looking for more feedback from the community on how companies generally handle user and quota management.
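For anyone scripting these steps instead of using curl, here is a minimal Python sketch of the same two calls. It assumes the port-forwards and endpoints shown above (localhost:8090 for the metadata service, localhost:8888 for the gateway). The function names are hypothetical helpers, not part of AIBrix. The sketch only builds the requests; sending them needs a running cluster.

```python
import json
import urllib.request

# Assumed local endpoints, taken from the curl commands above.
METADATA_URL = "http://localhost:8090"  # port-forwarded aibrix-metadata-service
GATEWAY_URL = "http://localhost:8888"   # gateway endpoint from the quick start


def build_create_user_request(name, rpm, tpm):
    """Build the CreateUser call with per-minute request and token limits."""
    body = json.dumps({"name": name, "rpm": rpm, "tpm": tpm}).encode("utf-8")
    return urllib.request.Request(
        f"{METADATA_URL}/CreateUser",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_chat_request(user, model, prompt, api_key="any_key"):
    """Build a chat completion call that carries the user header."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{GATEWAY_URL}/v1/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
            "user": user,  # header the gateway uses to look up the user's limits
        },
        method="POST",
    )


def send(request):
    """Send a prepared request and return (status, body text)."""
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")


# Usage against a running cluster (not executed here):
# send(build_create_user_request("your-user-name", 100, 1000))
# send(build_chat_request("your-user-name", "llama2-7b", "Say this is a test!"))
```

Keeping request construction separate from sending makes it easy to inspect the headers and payload before anything reaches the cluster.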

varungup90 avatar Mar 13 '25 22:03 varungup90

In the create-user step above, how is the authentication key managed? How do I assign an authentication key to each user?

Also, how will we measure user usage metrics, such as token consumption, against each created user and its authentication key?

vivekrsintc avatar Mar 15 '25 18:03 vivekrsintc

In the create-user step above, how is the authentication key managed? How do I assign an authentication key to each user?

The authentication key in the request is for the model and is not associated with a user. Right now, create user is a pretty naive implementation and needs work to formalize it.

Also, how will we measure user usage metrics, such as token consumption, against each created user and its authentication key?

The gateway internally tracks token consumption in a one-minute window and validates each request against it, to prevent a user from over-consuming tokens or requests within that window. Currently, per-user token and request usage is not exposed in metrics.
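To illustrate the idea of one-minute window validation, here is a minimal Python sketch of a fixed-window limiter with per-user RPM and TPM quotas. This is not the gateway's actual implementation. The class and method names are hypothetical, and the sketch assumes token counts are recorded after a response completes, so the token check is against tokens already consumed in the current window.

```python
import time
from dataclasses import dataclass


@dataclass
class UserQuota:
    rpm: int  # requests allowed per window
    tpm: int  # tokens allowed per window


class FixedWindowLimiter:
    """Illustrative per-user limiter; counters reset when a new window starts."""

    def __init__(self, quotas, clock=time.time, window_seconds=60):
        self.quotas = quotas          # user name -> UserQuota
        self.clock = clock            # injectable so the window can be tested
        self.window_seconds = window_seconds
        self.usage = {}               # user name -> (window index, requests, tokens)

    def _current(self, user):
        window = int(self.clock() // self.window_seconds)
        stored = self.usage.get(user)
        if stored is None or stored[0] != window:
            return window, 0, 0       # first use, or the previous window expired
        return stored

    def allow_request(self, user):
        """Admit the request if both quotas still have room in this window."""
        quota = self.quotas[user]
        window, requests, tokens = self._current(user)
        if requests >= quota.rpm or tokens >= quota.tpm:
            return False
        self.usage[user] = (window, requests + 1, tokens)
        return True

    def record_tokens(self, user, tokens_used):
        """Add the tokens consumed by a completed request to this window."""
        window, requests, tokens = self._current(user)
        self.usage[user] = (window, requests, tokens + tokens_used)
```

Because the per-user counters already exist in a structure like this, exposing them as metrics would mostly be a matter of exporting the stored values.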

varungup90 avatar Mar 17 '25 19:03 varungup90

@varungup90 Thanks. I think the gateway would benefit from a feature that maps model authentication to each user or group of users and exposes the associated token consumption through usage metrics.

With respect to the current implementation, how do I configure authentication for a model? That is also missing from the documentation.

Could you please help me understand how I can create a user with an authentication key mapped to a model (as per your current feature) and associate that user with RPM and TPM limits?

Thanks

vivekrsintc avatar Mar 18 '25 03:03 vivekrsintc

@vivekrsintc Right now, the features you have listed are not present. Before implementing them, I need to understand the requirements better. I could not find you on the aibrix Slack channel. Can you ping me there?

varungup90 avatar Mar 24 '25 23:03 varungup90