chat-with-your-data-solution-accelerator

Use async framework, not Flask

Open pamelafox opened this issue 2 years ago • 4 comments

Motivation

See my blog post here: http://blog.pamelafox.org/2023/09/best-practices-for-openai-chat-apps.html

We ported the other sample to Quart, since it offers a nearly 1:1 mapping from Flask, but the more popular async framework is FastAPI.

Such a change would let users handle more requests with fewer resources / lower SKUs.
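
As a rough illustration of the target, here's a minimal sketch of an async FastAPI chat endpoint. The route, request model, and model name are hypothetical (not this repo's actual code), and it assumes the openai v1 Python SDK:

```python
# Minimal sketch, not this repo's code: an async FastAPI endpoint that awaits
# the OpenAI call instead of blocking the worker. Assumes the openai>=1.0 SDK
# and OPENAI_API_KEY in the environment; route/model names are illustrative.
from fastapi import FastAPI
from openai import AsyncOpenAI
from pydantic import BaseModel

app = FastAPI()
client = AsyncOpenAI()


class ChatRequest(BaseModel):
    message: str


@app.post("/chat")
async def chat(req: ChatRequest):
    # While this await is pending, the event loop can serve other requests.
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": req.message}],
    )
    return {"reply": response.choices[0].message.content}
```

Run with something like `uvicorn app:app` (assuming the module is named `app.py`).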

How would you feel if this feature request was implemented?

efficient

Requirements

  • Switch to a more efficient application framework

Tasks

To be filled in by the engineer picking up the issue

  • [ ] Task 1
  • [ ] Task 2
  • [ ] ...

pamelafox avatar Oct 16 '23 19:10 pamelafox

Note that this may affect deployability if you're relying on App Service's auto-detection of the app startup command, as it doesn't yet detect any async frameworks.
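
In that case you'd need to set an explicit startup command. A sketch of what that might look like, assuming gunicorn with uvicorn workers and a hypothetical `app:app` module path:

```bash
# Illustrative only; the module path and worker count depend on the app.
gunicorn --workers 4 --worker-class uvicorn.workers.UvicornWorker app:app
```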

pamelafox avatar Oct 16 '23 20:10 pamelafox

Thank you so much, @pamelafox! We'll triage this change and prioritize it for deployment accordingly.

gmndrg avatar Oct 19 '23 04:10 gmndrg

Question: The Dockerfiles use uWSGI; wouldn't that already support parallelism, at least in terms of parallel users?

ashikns avatar Dec 05 '23 13:12 ashikns

@ashikns Yes, you can have multiple workers/threads when running a WSGI app with gunicorn on a multi-core machine. However, if all those workers are tied up waiting for the results of an API call, they can't respond to new user requests. With async calls and frameworks, a worker can service new requests while waiting for results, so you can serve more users with fewer cores.
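
A toy sketch of the difference (illustrative, not code from this repo): ten concurrent awaits on one event loop finish in roughly the time of one, whereas ten blocking calls on a single sync worker would run back to back:

```python
# Toy illustration: asyncio.sleep stands in for a slow upstream API call.
import asyncio
import time


async def handle_request(i: int) -> None:
    await asyncio.sleep(1)  # simulated upstream latency
    print(f"request {i} done")


async def main() -> None:
    start = time.perf_counter()
    # All ten "requests" overlap on one event loop: total ~1s, not ~10s.
    await asyncio.gather(*(handle_request(i) for i in range(10)))
    print(f"elapsed: {time.perf_counter() - start:.1f}s")


asyncio.run(main())
```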

pamelafox avatar Dec 05 '23 15:12 pamelafox

This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 30 days.

github-actions[bot] avatar Sep 09 '24 02:09 github-actions[bot]