
RFC for alpa api authentication

Open rjiang-ptm opened this issue 2 years ago • 3 comments

  • issue #700

rjiang-ptm avatar Sep 20 '22 18:09 rjiang-ptm

@rjiang-ptm thanks!

@merrymercy @zhuohan123 please comment

zhisbug avatar Sep 20 '22 18:09 zhisbug

Thank you for developing this great framework and for making the API public!

I was directed here by Hao, and I'm looking for programmatic access to the OPT interface. I believe it will be used in the following ways:

  • Generation: Using the completions API to generate text given a prompt
  • Classification: Still using the completions API, but restricting the output tokens and using their log probabilities for classification (e.g., this)
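To make the classification use case concrete, here is a minimal sketch of the client-side step, assuming the completions API can return a log probability for each candidate token. The function name `classify_from_logprobs` and the input shapes are hypothetical and not part of any existing Alpa API.

```python
import math


def classify_from_logprobs(token_logprobs, label_tokens):
    """Pick a label by comparing log probabilities of the allowed tokens only.

    token_logprobs: dict mapping token string -> log probability
                    (assumed to come from the completions response).
    label_tokens:   dict mapping label name -> the token that represents it.
    """
    # Keep only the candidate label tokens; every other token is ignored.
    scores = {
        label: token_logprobs[tok]
        for label, tok in label_tokens.items()
        if tok in token_logprobs
    }
    if not scores:
        raise ValueError("none of the label tokens were scored")

    # Renormalize over the restricted set (softmax via log-sum-exp).
    max_lp = max(scores.values())
    log_z = max_lp + math.log(sum(math.exp(lp - max_lp) for lp in scores.values()))
    probs = {label: math.exp(lp - log_z) for label, lp in scores.items()}

    best = max(probs, key=probs.get)
    return best, probs
```

Renormalizing over the label tokens only is one possible design choice; comparing the raw log probabilities would select the same label.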

I hope the above makes sense. I noticed that this pull request appears to need some work. Please let me know if there's anything I can help with or if you have any questions.

In case any of the requested features are not available, I'll be happy to help create them :)

mingkaid avatar Sep 23 '22 04:09 mingkaid

We did some refactoring and introduced a multi-model serving controller.

The new priority and batching logic can be implemented here https://github.com/alpa-projects/alpa/blob/32468ee67b1f9222fcabcc62c916c8326ee8ce9f/examples/llm_serving/launch_model_worker.py#L82-L108
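As a rough illustration of what priority-aware batching could look like, here is a standalone sketch. It is not the code in `launch_model_worker.py`; the class `PriorityBatcher`, its parameters, and the convention that a lower number means higher priority are all assumptions made for this example.

```python
import heapq
import itertools


class PriorityBatcher:
    """Hypothetical queue that forms batches in priority order."""

    def __init__(self, max_batch_size):
        self.max_batch_size = max_batch_size
        self._heap = []
        # Monotonic counter keeps FIFO order among equal priorities.
        self._counter = itertools.count()

    def submit(self, request, priority=0):
        # Assumption: a lower number means the request is served earlier.
        heapq.heappush(self._heap, (priority, next(self._counter), request))

    def next_batch(self):
        # Pop up to max_batch_size requests, highest priority first.
        batch = []
        while self._heap and len(batch) < self.max_batch_size:
            _, _, request = heapq.heappop(self._heap)
            batch.append(request)
        return batch
```

A worker loop would call `next_batch()` whenever the model is free and run the returned requests together.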

The authentication can probably be handled as a middleware here https://github.com/alpa-projects/alpa/blob/32468ee67b1f9222fcabcc62c916c8326ee8ce9f/alpa/serve/controller.py#L227-L245
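A minimal sketch of the middleware idea, assuming the controller exposes a standard ASGI application and that clients send an `Authorization: Bearer <key>` header. The class `ApiKeyMiddleware` and the key list are hypothetical; this is not code from `controller.py`.

```python
import hmac


class ApiKeyMiddleware:
    """Hypothetical ASGI middleware that rejects requests without a valid key."""

    def __init__(self, app, valid_keys):
        self.app = app
        self.valid_keys = [key.encode() for key in valid_keys]

    async def __call__(self, scope, receive, send):
        # Only HTTP requests are checked; other scope types pass through.
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        # ASGI headers are a list of (name, value) byte pairs, names lowercased.
        headers = dict(scope.get("headers", []))
        auth = headers.get(b"authorization", b"")
        prefix = b"Bearer "
        token = auth[len(prefix):] if auth.startswith(prefix) else b""

        # Constant-time comparison against each known key.
        if not any(hmac.compare_digest(token, key) for key in self.valid_keys):
            await send({
                "type": "http.response.start",
                "status": 401,
                "headers": [(b"content-type", b"text/plain")],
            })
            await send({"type": "http.response.body", "body": b"unauthorized"})
            return

        await self.app(scope, receive, send)
```

Wrapping would then look like `app = ApiKeyMiddleware(app, ["my-key"])`, where `app` is whatever ASGI application the controller serves. Per-key priority could be looked up at the same point.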

merrymercy avatar Oct 01 '22 16:10 merrymercy

@rjiang-ptm since we have agreed (in email) to proceed with Option 3, maybe we can start the implementation (w/ MBZUAI team)?

zhisbug avatar Nov 15 '22 07:11 zhisbug

Closed due to inactivity. Feel free to reopen if you have new progress.

merrymercy avatar Dec 07 '22 19:12 merrymercy