alpa
RFC for alpa api authentication
- issue #700
@rjiang-ptm thanks!
@merrymercy @zhuohan123 please comment
Thank you for developing this great framework and for making the API public!
I was directed here by Hao, and I'm looking for programmatic access to the OPT interface. I believe it will be used in the following ways:
- Generation: Using the `completions` API to generate text given a prompt
- Classification: Still using the `completions` API, but restrict the output tokens, and use their log probabilities for classification (e.g., this)
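To make the classification use case concrete, here is a minimal sketch of the post-processing step. It assumes the `completions` response can return a log probability for each candidate label token; the function name `classify_from_logprobs` and the input dict shape are hypothetical and not part of the Alpa API.

```python
import math


def classify_from_logprobs(label_logprobs):
    """Turn per-label log probabilities into a normalized distribution.

    `label_logprobs` maps each candidate label token to the log probability
    the model assigned to it (hypothetical input shape, not the Alpa API).
    Returns (best_label, probs).
    """
    # Subtract the max before exponentiating for numerical stability.
    max_lp = max(label_logprobs.values())
    exps = {label: math.exp(lp - max_lp) for label, lp in label_logprobs.items()}
    total = sum(exps.values())
    # Renormalize over the restricted label set only.
    probs = {label: value / total for label, value in exps.items()}
    best_label = max(probs, key=probs.get)
    return best_label, probs
```

Renormalizing over only the allowed labels is one reasonable choice here; the raw log probabilities could also be compared directly if calibration across labels is not needed.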
I hope the above makes sense. I noticed that this pull request appears to need some work. Please let me know if there's anything I can help with or if you have any questions.
In case any of the requested features is not available, I'll be happy to help create it :)
We did some refactoring and introduced a multi-model serving controller.
The new priority and batching logic can be implemented here https://github.com/alpa-projects/alpa/blob/32468ee67b1f9222fcabcc62c916c8326ee8ce9f/examples/llm_serving/launch_model_worker.py#L82-L108
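As a rough illustration of what priority-aware batching could look like, here is a small standalone sketch. It is not the code in `launch_model_worker.py`; the class `PriorityBatcher`, its methods, and the "lower number is served first" convention are all assumptions made for the example.

```python
import heapq
import itertools


class PriorityBatcher:
    """Hypothetical helper: groups queued requests into batches by priority."""

    def __init__(self, max_batch_size):
        self.max_batch_size = max_batch_size
        self._heap = []
        # Monotonic counter keeps FIFO order among requests of equal priority.
        self._counter = itertools.count()

    def submit(self, request, priority):
        # Assumption: a lower priority number means "serve earlier".
        heapq.heappush(self._heap, (priority, next(self._counter), request))

    def next_batch(self):
        # Pop up to max_batch_size requests, highest priority first.
        batch = []
        while self._heap and len(batch) < self.max_batch_size:
            _, _, request = heapq.heappop(self._heap)
            batch.append(request)
        return batch
```

For example, authenticated users could be submitted with priority 0 and anonymous users with priority 1, so that each batch is filled with authenticated requests first.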
The authentication can probably be handled as a middleware here https://github.com/alpa-projects/alpa/blob/32468ee67b1f9222fcabcc62c916c8326ee8ce9f/alpa/serve/controller.py#L227-L245
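A minimal sketch of the middleware idea, written against the plain ASGI interface so it does not depend on any particular framework. Whether the controller can be wrapped this way is an assumption; the class name `ApiKeyMiddleware`, the `Authorization: Bearer <key>` scheme, and the in-memory key list are hypothetical.

```python
import hmac


class ApiKeyMiddleware:
    """Hypothetical ASGI middleware that rejects requests without a valid key."""

    def __init__(self, app, valid_keys):
        self.app = app
        self.valid_keys = [key.encode() for key in valid_keys]

    async def __call__(self, scope, receive, send):
        # Only HTTP requests are checked; other scope types pass through.
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        # ASGI headers are a list of (name, value) byte pairs, names lowercased.
        headers = dict(scope.get("headers", []))
        auth = headers.get(b"authorization", b"")
        token = auth[len(b"Bearer "):] if auth.startswith(b"Bearer ") else b""

        # compare_digest avoids leaking key contents through timing.
        if token and any(hmac.compare_digest(token, key) for key in self.valid_keys):
            await self.app(scope, receive, send)
            return

        body = b'{"error": "invalid or missing API key"}'
        await send({
            "type": "http.response.start",
            "status": 401,
            "headers": [(b"content-type", b"application/json")],
        })
        await send({"type": "http.response.body", "body": body})
```

In a real deployment the keys would presumably come from a database or config rather than a Python list, and the per-key identity could feed into the priority logic.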
@rjiang-ptm since we have agreed (in email) to proceed with Option 3, maybe we can start the implementation (with the MBZUAI team)?
Closed due to inactivity. Feel free to reopen if you have new progress.