
Use isolated workers for heavy operations (e.g. inference)

Open zielo-hue opened this issue 3 years ago • 1 comment

The app becomes completely unresponsive while it processes requests that involve transformers. Offloading such heavy tasks to separate workers through a task queue such as Celery should greatly improve the end-user experience and also make the web app much more scalable. I recommend RabbitMQ as the broker and Redis as the result backend for Celery, since those seem to be the most widely used.
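To illustrate the idea, here is a minimal sketch of offloading heavy work to isolated worker processes. It deliberately uses the standard library's `ProcessPoolExecutor` as a stand-in, not Celery itself, so it runs without a broker. `run_inference`, `submit_job`, and `demo` are hypothetical names, and the loop inside `run_inference` only mimics a CPU-bound transformer call; none of this is taken from the app's code.

```python
from concurrent.futures import Future, ProcessPoolExecutor


def run_inference(prompt: str) -> str:
    """Hypothetical stand-in for a CPU-bound transformer forward pass."""
    total = 0
    for i in range(200_000):  # burn some CPU to mimic heavy work
        total += i * i
    return prompt.upper()


def submit_job(pool: ProcessPoolExecutor, prompt: str) -> Future:
    """Queue the job on a worker process and return immediately."""
    return pool.submit(run_inference, prompt)


def demo() -> list[str]:
    # Two worker processes, separate from the process handling requests.
    with ProcessPoolExecutor(max_workers=2) as pool:
        futures = [submit_job(pool, p) for p in ("hello", "world")]
        # The submitting process is not blocked until it asks for results.
        return [f.result(timeout=30) for f in futures]


if __name__ == "__main__":
    print(demo())
```

With Celery, `run_inference` would presumably become a task declared with `@app.task`, and `pool.submit(...)` would become `run_inference.delay(...)`, with the broker and result backend carrying jobs and results between the web app and workers that may run on other machines.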

zielo-hue Nov 24 '21 05:11

It looks like the GPTHF module also regenerates logits for every request, and the app's behavior suggests that it is multi-threaded. Both may contribute to the app becoming completely unresponsive.

koukuno Dec 01 '21 02:12