loky
Use with Flask
Apologies if this is not the right place, but I'd appreciate your input and any pointers in any case:
I am using Flask to service requests to a function that calls opencv at the back. I am trying to parallelize the function using Processes rather than Threads, but there is a known issue with opencv and Processes which pointed me to Loky via https://github.com/opencv/opencv/issues/5150#issuecomment-400727184
Using the Loky executor I end up with Flask context problems (see below).
Do you know if this is a solved problem (Loky + Flask) or do you know of any fundamental showstoppers?
Thanks so much in advance for all assistance
  File "/anaconda2/lib/python2.7/site-packages/loky/_base.py", line 433, in result
    return self.__get_result()
  File "/anaconda2/lib/python2.7/site-packages/loky/_base.py", line 381, in __get_result
    raise self._exception
BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.
This was caused directly by
'''
Traceback (most recent call last):
  File "/anaconda2/lib/python2.7/site-packages/loky/process_executor.py", line 391, in _process_worker
    call_item = call_queue.get(block=True, timeout=timeout)
  File "/anaconda2/lib/python2.7/multiprocessing/queues.py", line 135, in get
    res = self._recv()
  File "myfile.py", line 39, in <module>
    with app.app_context():
  File "/anaconda2/lib/python2.7/site-packages/werkzeug/local.py", line 348, in __getattr__
    return getattr(self._get_current_object(), name)
  File "/anaconda2/lib/python2.7/site-packages/werkzeug/local.py", line 307, in _get_current_object
    return self.__local()
  File "/anaconda2/lib/python2.7/site-packages/flask/globals.py", line 52, in _find_app
    raise RuntimeError(_app_ctx_err_msg)
RuntimeError: Working outside of application context.
This typically means that you attempted to use functionality that needed
to interface with the current application object in some way. To solve
this, set up an application context with app.app_context(). See the
documentation for more information.
'''
The cause of the problem seems to be that you are calling app.app_context() from the child process. That context does not exist in the child process because it was initialized in the main process.
I am not familiar with Flask, but you probably need to do one of the following:
- Do not use any of the Flask app context API in the function executed in the workers. Write the function so that it focuses only on the computational part and touches no objects other than those passed to it as arguments.
- Alternatively, pass the app context explicitly as an argument to your parallel function calls, if the app context is picklable. I am not sure whether that would cause other kinds of problems, because I do not know what objects this context holds. For instance, if it holds connections to a DB, I am not sure we want the worker processes to reconnect automagically to the DB.
The important line in the traceback is:
  File "myfile.py", line 39, in <module>
    with app.app_context():
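The first suggestion above (keep the worker function free of Flask objects) can be sketched as follows. This is only an illustration under assumptions: `histogram` and `submit_histogram` are hypothetical names, the histogram is computed in pure Python instead of with opencv, and the executor is taken to be any object with a concurrent.futures-style `submit` method, which the executor returned by loky's get_reusable_executor() provides.

```python
def histogram(pixels, bins=4, upper=256):
    """Pure computation: uses only its arguments, no Flask objects.

    Hypothetical stand-in for the opencv call in the thread.
    """
    width = upper // bins
    counts = [0] * bins
    for value in pixels:
        # Clamp so that the largest values land in the last bin.
        counts[min(value // width, bins - 1)] += 1
    return counts


def submit_histogram(executor, pixels, bins=4):
    """Submit the pure function to a concurrent.futures-style executor.

    Only plain data (a list of ints and an int) is handed to the worker,
    so the worker never needs the Flask request or app context.
    """
    return executor.submit(histogram, pixels, bins)
```

In a Flask view, the idea would be to read whatever is needed from the request or the app configuration first, in the main process, and then pass only those plain values to `submit_histogram`. Whether this fully avoids the error above also depends on the worker not running context-dependent code when it imports the module.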
Massive thanks for the speedy reply @ogrisel
I am tracking this on SO at https://stackoverflow.com/questions/58093735/flask-how-do-i-successfully-use-multiprocessing-not-multithreading
flask-executor is now taking care of parallel processing for me for non-opencv calls. I am wondering what Loky is doing that makes it work for opencv, and whether I can get flask-executor to take Loky as a custom executor.
If I simply hack the flask-executor code to take a Loky get_reusable_executor(), I get the context problem outlined above.
Any and all help much appreciated
Please provide a minimal reproducible example.
Please, please, please, try to make it as minimal as possible while still reproducing the error.
Otherwise it's unlikely that anybody will be able to help you.
For instance, here is a minimal Python program that uses flask, loky and opencv, but it does not reproduce the problem you have:
from flask import Flask, escape, request
import cv2
from loky import get_reusable_executor

app = Flask(__name__)


def opencv_work(img):
    hist = cv2.calcHist([img], [0], None, [256], [0, 256])
    return hist


@app.route('/')
def hello():
    e = get_reusable_executor()
    img = cv2.imread('image.png', 0)
    f = e.submit(opencv_work, img)
    return f'Computed: {f.result()} in a loky subprocess'
To launch it, I saved it as a file named "flask_loky_opencv.py", put a PNG file named "image.png" in the same folder, and executed the following command:
$ env FLASK_APP=flask_loky_opencv.py flask run
* Serving Flask app "flask_loky_opencv.py"
* Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
* Debug mode: off
* Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
127.0.0.1 - - [25/Sep/2019 16:00:35] "GET / HTTP/1.1" 200 -
When I open the flask URL I can see the text message with the histogram values as expected and no error in the flask logs.
@ogrisel I am so sorry for bothering you and your assistance is much appreciated.
Thank you so much for taking the trouble to write the MRE which I confirm also works for me.
The problem is I think related to my architecture so do feel free to close the issue if you like. To summarize:
There was indeed a Flask context issue which I am more confident about solving now.
Flask aside, I think the problem is in fact my attempt to submit the jobs from within a module, as follows (NB python 2.7 and macOS):
main.py:

import mymodule

mymodule.py:

from loky import get_reusable_executor
import cv2

def opencv_work2(img):
    print 'I am a worker'
    color = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
    hist = cv2.calcHist([color[0]], [0], None, [256], [0, 256])
    return hist

e = get_reusable_executor(max_workers=1)
img = cv2.imread('image.png', 0)
print 'image loaded'
f = [e.submit(opencv_work2, img) for i in [0]]  # apologies for syntax
print 'submitted ok'
for x in iter(f):  # apologies for syntax
    print x.result()
When executing python main.py, the code gets as far as 'submitted ok' and then just hangs.
Do you agree that this could be because the submission is not 'guarded' by if __name__ == '__main__'?
If so, this seems to be a showstopper: my functions are deliberately not executed in main.py, which I reserve for the Flask endpoints.
Do you know of any way at all to create processes (using loky or otherwise) from within a nested/imported module?
Is there any other workaround?
Huge thanks once again
@jtlz2 I am not sure what you are doing here. Is there a typo, and should the executor be created during the import of mymodule, or should it be created inside opencv_work2, in which case you seem to be creating an infinite recursion?
The main issue I can see if you spawn the workers directly at module import is that each worker would import the module too and try to spawn its own executor, leading to many nested executors. I would rather write it like this:
# mymodule.py
from loky import get_reusable_executor
import cv2

def opencv_work2(img):
    print 'I am a worker'
    color = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
    hist = cv2.calcHist([color[0]], [0], None, [256], [0, 256])
    return hist

def init_module():
    e = get_reusable_executor(max_workers=1)
    img = cv2.imread('image.png', 0)
    print 'image loaded'
    f = [e.submit(opencv_work2, img) for i in [0]]  # apologies for syntax
    print 'submitted ok'
    for x in iter(f):  # apologies for syntax
        print x.result()

# main.py
from mymodule import init_module

init_module()