Use with Flask

Open jtlz2 opened this issue 4 years ago • 6 comments

Apologies if this is not the right place, but I'd appreciate your input and any pointers in any case:

I am using Flask to service requests to a function that calls opencv at the back. I am trying to parallelize the function using Processes rather than Threads, but there is a known issue with opencv and Processes which pointed me to Loky via https://github.com/opencv/opencv/issues/5150#issuecomment-400727184

Using the Loky executor I end up with Flask context problems (see below).

Do you know if this is a solved problem (Loky + Flask) or do you know of any fundamental showstoppers?

Thanks so much in advance for all assistance

  File "/anaconda2/lib/python2.7/site-packages/loky/_base.py", line 433, in result
    return self.__get_result()
  File "/anaconda2/lib/python2.7/site-packages/loky/_base.py", line 381, in __get_result
    raise self._exception
BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.
This was caused directly by
'''
Traceback (most recent call last):
  File "/anaconda2/lib/python2.7/site-packages/loky/process_executor.py", line 391, in _process_worker
    call_item = call_queue.get(block=True, timeout=timeout)
  File "/anaconda2/lib/python2.7/multiprocessing/queues.py", line 135, in get
    res = self._recv()
  File "myfile.py", line 39, in <module>
    with app.app_context():
  File "/anaconda2/lib/python2.7/site-packages/werkzeug/local.py", line 348, in __getattr__
    return getattr(self._get_current_object(), name)
  File "/anaconda2/lib/python2.7/site-packages/werkzeug/local.py", line 307, in _get_current_object
    return self.__local()
  File "/anaconda2/lib/python2.7/site-packages/flask/globals.py", line 52, in _find_app
    raise RuntimeError(_app_ctx_err_msg)
RuntimeError: Working outside of application context.

This typically means that you attempted to use functionality that needed
to interface with the current application object in some way. To solve
this, set up an application context with app.app_context().  See the
documentation for more information.
'''

jtlz2 avatar Sep 25 '19 06:09 jtlz2

The cause of the problem seems to be that you are calling app.app_context() from the child process. That context does not exist in the child process because it was initialized in the main process.

I am not familiar with Flask, but you probably need to do one of the following:

  • Do not use any of the Flask app context API in the function to be executed in the workers. Write your function so that it only focuses on the computational part, without accessing any objects other than those passed as arguments to the function.
  • Alternatively, pass the app context explicitly as an argument to your parallel function calls, if the app context is picklable. I am not sure whether that can cause other kinds of problems, because I do not know what kind of objects this context holds. For instance, if it holds connections to a DB, I am not sure we want the worker processes to reconnect to the DB automatically.

The important line in the traceback is:

  File "myfile.py", line 39, in <module>
    with app.app_context():
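
A minimal sketch of the first option above. It assumes the standard library's ProcessPoolExecutor as a stand-in for the loky executor (both follow the same submit() interface returning futures) and a plain dict named config as a hypothetical stand-in for app.config. The function names are made up for illustration. Since the traceback shows the app_context() call running at module level (in <module>) while the worker loads the task, the sketch keeps everything context-bound inside a function that only the parent process calls:

```python
from concurrent.futures import ProcessPoolExecutor


def scale_pixels(pixels, factor):
    """Pure computation: touches nothing but its arguments."""
    return [min(255, int(p * factor)) for p in pixels]


def handle_request(config, pixels, executor):
    # Read context-bound values in the parent process only.
    # `config` is a plain dict standing in for app.config (assumption).
    factor = config["SCALE_FACTOR"]
    # Only plain, picklable values cross the process boundary.
    future = executor.submit(scale_pixels, pixels, factor)
    return future.result()


if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=1) as executor:
        print(handle_request({"SCALE_FACTOR": 2.0}, [10, 100, 200], executor))
```

The worker never needs an application object, so nothing in the module has to open an app context when a worker imports it.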

ogrisel avatar Sep 25 '19 08:09 ogrisel

Massive thanks for the speedy reply @ogrisel

I am tracking this on SO at https://stackoverflow.com/questions/58093735/flask-how-do-i-successfully-use-multiprocessing-not-multithreading

flask-executor is now taking care of parallel processing for me for non-opencv calls. I am wondering what Loky is doing that makes it work for opencv, and whether I can get flask-executor to take Loky as a custom executor.

If I simply hack the flask-executor code to take a Loky get_reusable_executor(), I get the context problem outlined above.

Any and all help much appreciated

jtlz2 avatar Sep 25 '19 09:09 jtlz2

Please provide a minimal reproducible example.

Please, please, please, try to make it as minimal as possible while still reproducing the error.

Otherwise it's unlikely that anybody will be able to help you.

ogrisel avatar Sep 25 '19 13:09 ogrisel

For instance, here is a minimal Python program that uses Flask, loky and OpenCV. However, it does not reproduce the problem you have:

from flask import Flask
import cv2
from loky import get_reusable_executor


app = Flask(__name__)


def opencv_work(img):
    hist = cv2.calcHist([img], [0], None, [256], [0,256])
    return hist


@app.route('/')
def hello():
    e = get_reusable_executor()
    img = cv2.imread('image.png', 0)
    f = e.submit(opencv_work, img)
    return f'Computed: {f.result()} in a loky subprocess'

To launch it, I saved it as a file named "flask_loky_opencv.py", put a PNG file named "image.png" in the same folder, and executed the following command:

$ env FLASK_APP=flask_loky_opencv.py flask run
 * Serving Flask app "flask_loky_opencv.py"
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
127.0.0.1 - - [25/Sep/2019 16:00:35] "GET / HTTP/1.1" 200 -

When I open the flask URL I can see the text message with the histogram values as expected and no error in the flask logs.

ogrisel avatar Sep 25 '19 14:09 ogrisel

@ogrisel I am so sorry for bothering you and your assistance is much appreciated.

Thank you so much for taking the trouble to write the MRE which I confirm also works for me.

The problem is, I think, related to my architecture, so do feel free to close the issue if you like. To summarize:

There was indeed a Flask context issue which I am more confident about solving now.

Flask aside, I think the problem is in fact my attempt to submit the jobs from within a module, as follows (NB python 2.7 and macOS):

main.py:

import mymodule

mymodule.py:

from loky import get_reusable_executor
import cv2

def opencv_work2(img):
    print 'I am a worker'
    color = cv2.cvtColor(img,cv2.COLOR_GRAY2BGR)
    hist = cv2.calcHist([color[0]], [0], None, [256], [0,256])
    return hist

    e = get_reusable_executor(max_workers=1)
    img = cv2.imread('image.png', 0)
    print 'image loaded'

    f = [e.submit(opencv_work2, img) for i in [0]] # apologies for syntax
    print 'submitted ok'
    for x in iter(f): # apologies for syntax
        print x.result()

When executing python main.py, the code gets as far as 'submitted ok' and then just hangs.

Do you agree that this could be because the submission is not 'guarded' by __name__ == '__main__'?

If so, this seems to be a showstopper: my functions are deliberately not executed in main.py, which I reserve for the Flask endpoints.

Do you know of any way at all to create processes (using loky or otherwise) from within a nested/imported module?

Is there any other workaround?

Huge thanks once again

jtlz2 avatar Sep 26 '19 07:09 jtlz2

@jtlz2 I am not sure what you are doing here. Is there a typo? Should the executor be created during the import of mymodule, or inside opencv_work2? In the latter case you seem to be creating an infinite recursion.

The main issue I can see if you spawn the workers directly at module import is that it would try to spawn many nested executors. I would rather write it like this:

# mymodule.py
from loky import get_reusable_executor
import cv2

def opencv_work2(img):
    print 'I am a worker'
    color = cv2.cvtColor(img,cv2.COLOR_GRAY2BGR)
    hist = cv2.calcHist([color[0]], [0], None, [256], [0,256])
    return hist

def init_module():
    e = get_reusable_executor(max_workers=1)
    img = cv2.imread('image.png', 0)
    print 'image loaded'

    f = [e.submit(opencv_work2, img) for i in [0]] # apologies for syntax
    print 'submitted ok'
    for x in iter(f): # apologies for syntax
        print x.result()

# main.py
from mymodule import init_module
init_module()
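
On the guard question, a hedged Python 3 sketch of the same layout. It uses the standard library's ProcessPoolExecutor as a stand-in for loky, and count_values is a hypothetical substitute for the OpenCV call, so the snippet runs without extra packages. Python's multiprocessing documentation asks for the entry point to be protected with the __name__ == '__main__' check under the spawn and forkserver start methods; whether loky strictly requires it is not confirmed in this thread. The point illustrated is that the executor is created inside a function in the module and only the call in the entry script is guarded:

```python
# Single-file sketch; in the thread's layout the two functions would live in
# mymodule.py and the guarded block in main.py.
from concurrent.futures import ProcessPoolExecutor


def count_values(values, bins):
    """Worker: a stand-in for the OpenCV histogram call (assumption)."""
    hist = [0] * bins
    for v in values:
        hist[v % bins] += 1
    return hist


def run_jobs(batches, bins=4):
    # The executor is created when this function is called, not on import,
    # so re-importing the module in a worker starts no new processes.
    with ProcessPoolExecutor(max_workers=1) as executor:
        futures = [executor.submit(count_values, b, bins) for b in batches]
        return [f.result() for f in futures]


if __name__ == "__main__":
    print(run_jobs([[0, 1, 2, 3, 4], [5, 5, 5]]))
```

With this shape the module stays importable from anywhere, including from a file that defines Flask endpoints, because importing it has no side effects.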

tomMoral avatar Sep 26 '19 07:09 tomMoral