Sharing a model among worker processes
This is a question rather than an issue. I have gone over your code to see whether there is a way to share the model (nlp, prediction, etc.) among worker processes, to avoid loading the model in every worker and to make use of async definitions (which is a separate subject/problem), but I could not find one. Is there anything you can advise or apply in this skeleton?
Thanks.
Hi @mehmetilker, one approach used here is to utilize the app's state to share the model within the app.
When talking about distributed workers, you can load the model as a singleton in each worker. For large models, I recommend prefetching them from object storage (e.g. S3, COS) into memory (e.g. using Redis), so that each worker can load the model quickly at startup.
Does this help?
Hi @eightBEC
Using the app's state loads an instance once for the whole application lifetime, but that only helps if the application runs on a single worker. I am starting my application with the following configuration. With the current approach, the model is loaded into memory separately for each worker: with 5 worker processes that means 5 × 1.5 GB.
```
command=/home/xproj/.env/bin/gunicorn
    app.modelsApi.main:app
    -w 5
    -k uvicorn.workers.UvicornWorker
    --name gunicorn_models_api
    --bind 0.0.0.0:9200
```
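For a multi-worker gunicorn setup like this, one partial mitigation is gunicorn's `--preload` flag: the application (and the model, if it is loaded at import time) is loaded once in the master process before the workers are forked, so read-only pages can be shared between workers via copy-on-write. Sketch, keeping the same paths as above:

```shell
/home/xproj/.env/bin/gunicorn \
    app.modelsApi.main:app \
    -w 5 \
    -k uvicorn.workers.UvicornWorker \
    --preload \
    --name gunicorn_models_api \
    --bind 0.0.0.0:9200
```

Caveats: the model must be loaded at module import time (not in a startup hook) to benefit, `--preload` is incompatible with `--reload`, and CPython's reference counting and garbage collection touch object headers over time, which gradually un-shares the copy-on-write pages (calling `gc.freeze()` after loading can reduce this).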
Your suggested solution (loading from an object store into memory) does not change the situation, I think, if I understood you correctly.
Here is a related question on SO: https://stackoverflow.com/questions/41988915/avoiding-loading-spacy-data-in-each-subprocess-when-multiprocessing
I haven't tried it, but I think this is related: https://docs.python.org/3/library/multiprocessing.shared_memory.html "This module provides a class, SharedMemory, for the allocation and management of shared memory to be accessed by one or more processes on a multicore or symmetric multiprocessor (SMP) machine."
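A minimal sketch of the `multiprocessing.shared_memory` idea (Python 3.8+), using a raw byte blob as a stand-in for serialized model data; the segment name and contents are illustrative:

```python
# Sketch: share a read-only blob (e.g. serialized model weights) between
# processes via a named shared-memory segment.
from multiprocessing import shared_memory

weights = b"\x00\x01\x02\x03" * 4  # stand-in for serialized model data

# Producer: create the segment once and copy the data in.
shm = shared_memory.SharedMemory(
    create=True, size=len(weights), name="model_weights"
)
shm.buf[:len(weights)] = weights

# Consumer (normally in another process): attach by name, zero-copy read.
view = shared_memory.SharedMemory(name="model_weights")
data = bytes(view.buf[:len(weights)])

view.close()
shm.close()
shm.unlink()  # free the segment once no process needs it any more
```

The catch for spaCy specifically is that this only shares raw bytes: each process would still need to deserialize the pipeline from those bytes, so it mainly helps frameworks that can memory-map their weights directly.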
Hi @mehmetilker, have you found a solution for this? I'm facing the same here.
@viniciusdsmello no unfortunately...