Refactor app backend to async

Open sarah-widder opened this issue 2 years ago • 3 comments

This refactor is intended to improve the performance of the app by making the backend routes asynchronous and configuring the gunicorn worker count to support more concurrent requests.

  • Replace Flask with Quart to support ASGI
  • Upgrade openai to v1 and use the AsyncAzureOpenAI client for requests both with and without data
  • Use async client for chat history CosmosDB
  • Make all associated routes async
  • Add gunicorn config to dynamically set default worker count
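To illustrate why async routes can serve more concurrent requests, here is a minimal stdlib-only sketch. It is not the app's code: `fake_upstream_call`, `handle_request`, and `serve_concurrently` are hypothetical names, and `asyncio.sleep` stands in for an awaited network call such as a chat completion request. In the refactored backend the equivalent handler would be an `async def` route.

```python
import asyncio
import time


async def fake_upstream_call(delay: float) -> str:
    # Hypothetical stand-in for an awaited network call (e.g. a model request).
    await asyncio.sleep(delay)
    return "done"


async def handle_request(request_id: int, delay: float = 0.1) -> dict:
    # While this handler awaits, the event loop is free to run other handlers.
    result = await fake_upstream_call(delay)
    return {"id": request_id, "result": result}


async def serve_concurrently(n: int, delay: float = 0.1) -> tuple[list[dict], float]:
    # Run n handlers on one event loop and measure the total wall-clock time.
    start = time.perf_counter()
    responses = await asyncio.gather(*(handle_request(i, delay) for i in range(n)))
    elapsed = time.perf_counter() - start
    return list(responses), elapsed


if __name__ == "__main__":
    responses, elapsed = asyncio.run(serve_concurrently(10))
    print(f"{len(responses)} requests in {elapsed:.2f}s")
```

Because the waits overlap, the total time should stay close to a single call's delay rather than growing with the number of requests, which is the effect the async routes are meant to achieve for I/O-bound work.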

To run locally:

  1. Install requirements: `python -m pip install -r requirements.txt`
  2. Set up your .env file and run the VSCode debug configuration "Python: Quart"
  3. See app at http://127.0.0.1:50505/

To deploy:

  1. Run `az webapp up --name <app name> --runtime "Python:3.11" --sku "B1"` from the root of the repo.
  2. Change the startup command on the app service to `python3 -m gunicorn main:app`
  3. Restart the app.
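For reference, a gunicorn config that sets the default worker count dynamically might look roughly like the sketch below. This is an assumption, not necessarily the file in this PR: the `default_worker_count` helper is hypothetical, the `(2 x cores) + 1` rule is the general starting point suggested in the gunicorn docs, and `worker_class` assumes uvicorn is installed so that gunicorn can serve an ASGI app. Gunicorn reads `gunicorn.conf.py` from the working directory by default.

```python
# gunicorn.conf.py (sketch; values are assumptions to be tuned by load testing)
import multiprocessing


def default_worker_count(num_cpus: int) -> int:
    # Hypothetical helper: the common "(2 x cores) + 1" starting point.
    return num_cpus * 2 + 1


# Gunicorn setting: number of worker processes, derived from the host's CPU count.
workers = default_worker_count(multiprocessing.cpu_count())

# Gunicorn setting: an ASGI-capable worker class (assumes uvicorn is installed).
worker_class = "uvicorn.workers.UvicornWorker"
```

The right worker count depends on the App Service SKU and workload, so the formula is only a default to start load testing from.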

To do:

  • Thorough testing to ensure all previous functionality is supported
  • Load testing with different gunicorn settings
  • Update documentation
  • Update deployment scripts and templates
  • Update Azure AI Studio deployment flow

sarah-widder, Dec 16 '23 01:12

Hi @sarah-widder,

This is great! Do you have an ETA on when it might be merged?

mrn-linak, Jan 19 '24 09:01

Hi @mrn-linak, hoping for the end of this week. Thanks for your interest :)

sarah-widder, Jan 23 '24 18:01

Amazing! Thanks for the reply :-)

mrn-linak, Jan 24 '24 12:01

JFYI, for anyone using this before the deployment scripts are updated: change the startup command from "main:app" to "app:app", as "main:app" is a typo.

  1. Change the startup command on the app service to `python3 -m gunicorn app:app`

junan-trustarc, Feb 02 '24 16:02

@sarah-widder could it be that the start.sh / start.cmd scripts haven't been updated for these code changes?

iMicknl, Feb 02 '24 17:02