Compatibility issue with gunicorn

Open suvkaka opened this issue 4 years ago • 13 comments

I am trying to import spacy from gunicorn with worker gevent and Flask

  File "/tf/notebooks/NLP//model_spacy.py", line 1, in <module>
    import spacy
  File "/usr/local/lib/python3.6/dist-packages/spacy/__init__.py", line 10, in <module>
    from thinc.api import prefer_gpu, require_gpu, require_cpu  # noqa: F401
  File "/usr/local/lib/python3.6/dist-packages/thinc/api.py", line 2, in <module>
    from .initializers import normal_init, uniform_init, glorot_uniform_init, zero_init
  File "/usr/local/lib/python3.6/dist-packages/thinc/initializers.py", line 4, in <module>
    from .backends import Ops
  File "/usr/local/lib/python3.6/dist-packages/thinc/backends/__init__.py", line 6, in <module>
    from .ops import Ops
  File "/usr/local/lib/python3.6/dist-packages/thinc/backends/ops.py", line 10, in <module>
    from ..util import get_array_module, is_xp_array, to_numpy
  File "/usr/local/lib/python3.6/dist-packages/thinc/util.py", line 15, in <module>
    DATA_VALIDATION: ContextVar[bool] = ContextVar("DATA_VALIDATION", default=False)
TypeError: 'type' object is not subscriptable

suvkaka avatar Mar 05 '21 15:03 suvkaka

I used the sample application from the gunicorn docs and imported spaCy at the top and it worked without issue. Could you give example code to reproduce your issue and clarify what version of spaCy, thinc, and Python you're using?

My code:

import spacy

def app(environ, start_response):
    """Simplest possible application object"""
    data = b'Hello, World!\n'
    status = '200 OK'
    response_headers = [
        ('Content-type', 'text/plain'),
        ('Content-Length', str(len(data)))
    ]
    start_response(status, response_headers)
    return iter([data])

polm avatar Mar 06 '21 10:03 polm

Hi, I am using gunicorn with Flask; my apologies for not mentioning Flask earlier.

Note:

  • the same code works perfectly when I run it with the Flask dev server, i.e. 'python faiss_controller.py'
  • but the error occurs when I run it with gunicorn, i.e. 'gunicorn --worker-class gevent --bind 0.0.0.0:5001 faiss_wsgi:app'

These are the versions:

Python==3.6.9
spacy==3.0.3
spacy-legacy==3.0.1
gunicorn==20.0.4
Flask==1.1.2
Flask-RESTful==0.3.8
gevent==21.1.2
thinc==8.0.1

I am using the following two .py files

  1. faiss_wsgi.py
  2. faiss_controller.py

Here is the code

faiss_wsgi.py

from faiss_controller import app

if __name__ == "__main__":
    app.run()

faiss_controller.py

import spacy


import json
from flask import Flask
from flask import request,g
from flask_restful import reqparse
import time
import logging
import logging.config

import logconfig
import time

nlp = spacy.load("en_core_web_sm")
app = Flask(__name__)

@app.route('/api/v1/I/insert_faiss',methods=['POST'])
def insert_post():
 
    print("inside insert ")
    return {'status':'success'}

if __name__ == '__main__':
    app.run(host='0.0.0.0',port='8002',debug=True)

Usage: gunicorn --worker-class gevent --bind 0.0.0.0:5001 faiss_wsgi:app

Updated Error stack trace

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/arbiter.py", line 583, in spawn_worker
    worker.init_process()
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/workers/ggevent.py", line 162, in init_process
    super().init_process()
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/workers/base.py", line 119, in init_process
    self.load_wsgi()
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/workers/base.py", line 144, in load_wsgi
    self.wsgi = self.app.wsgi()
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/app/base.py", line 67, in wsgi
    self.callable = self.load()
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/app/wsgiapp.py", line 49, in load
    return self.load_wsgiapp()
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/app/wsgiapp.py", line 39, in load_wsgiapp
    return util.import_app(self.app_uri)
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/util.py", line 358, in import_app
    mod = importlib.import_module(module)
  File "/usr/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/tf/notebooks/NLP/dim_reduc/rec_sys/faiss_wsgi.py", line 5, in <module>
    from faiss_controller import app
  File "/tf/notebooks/NLP/dim_reduc/rec_sys/faiss_controller.py", line 1, in <module>
    import spacy
  File "/usr/local/lib/python3.6/dist-packages/spacy/__init__.py", line 10, in <module>
    from thinc.api import prefer_gpu, require_gpu, require_cpu  # noqa: F401
  File "/usr/local/lib/python3.6/dist-packages/thinc/api.py", line 2, in <module>
    from .initializers import normal_init, uniform_init, glorot_uniform_init, zero_init
  File "/usr/local/lib/python3.6/dist-packages/thinc/initializers.py", line 4, in <module>
    from .backends import Ops
  File "/usr/local/lib/python3.6/dist-packages/thinc/backends/__init__.py", line 6, in <module>
    from .ops import Ops
  File "/usr/local/lib/python3.6/dist-packages/thinc/backends/ops.py", line 10, in <module>
    from ..util import get_array_module, is_xp_array, to_numpy
  File "/usr/local/lib/python3.6/dist-packages/thinc/util.py", line 15, in <module>
    DATA_VALIDATION: ContextVar[bool] = ContextVar("DATA_VALIDATION", default=False)
TypeError: 'type' object is not subscriptable

suvkaka avatar Mar 06 '21 11:03 suvkaka

Thanks for the code sample. It works with Python 3.9 but I was able to reproduce the error with Python 3.6.9, so I guess it's Python version related.

We'll take a closer look at this, but you should be able to work around this by upgrading Python if that's an option.

polm avatar Mar 06 '21 13:03 polm

So what's happening is that when gunicorn loads ContextVar, the type annotation is failing for some reason. I am not sure why this happens.

First, Python 3.6 is a bit different because it has to use the backported context vars from here. So the implementation is not the same as the standard library version in 3.7+. But it should still support type annotations, and in a 3.6 Python shell ContextVar[bool] doesn't throw an error.
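
To illustrate the error message itself (not the root cause, which is still unknown here): the annotation on line 15 of thinc/util.py is evaluated when the module is imported, so `ContextVar[bool]` has to be a legal expression at runtime. The sketch below uses two hypothetical stand-in classes, `PlainVar` and `SubscriptableVar`, to show that subscripting a class fails with a TypeError unless the class provides a hook such as `__class_getitem__`. It assumes Python 3.7 or later, where that hook is available, and the exact wording of the message varies between Python versions.

```python
class PlainVar:
    """Hypothetical stand-in for a ContextVar class with no generic support."""

    def __init__(self, name, default=None):
        self.name = name
        self.default = default


class SubscriptableVar(PlainVar):
    """Same class, plus the hook that makes SubscriptableVar[bool] legal."""

    def __class_getitem__(cls, item):
        # The type parameter only matters to type checkers, so ignore it.
        return cls


def subscript_error(cls):
    """Return the TypeError message raised by cls[bool], or None if it works."""
    try:
        cls[bool]
    except TypeError as exc:
        return str(exc)
    return None


print(subscript_error(PlainVar))          # a "not subscriptable" message
print(subscript_error(SubscriptableVar))  # None
```

If the `ContextVar` class seen under the gevent worker behaves like `PlainVar`, that would be consistent with the traceback, but that is an assumption rather than something confirmed in this thread.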

If it were just that one line at issue we could try removing the type annotation, but the notation is used in several places, so besides being undesirable that's not even an easy fix.

I also verified this code works if the gunicorn worker class is sync instead of gevent.

I'll keep looking at this, but that's what I've found so far.
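
One way to narrow this down might be to check which `ContextVar` class the process actually sees. Below is a minimal diagnostic sketch; `contextvar_report` is a hypothetical helper name, not part of spaCy, thinc, or gunicorn. It only uses the `contextvars` module (standard library in 3.7+, the backport package on 3.6) and reports where the module was loaded from and whether `ContextVar[bool]` is accepted.

```python
import contextvars
import sys


def contextvar_report():
    """Collect facts about the ContextVar class visible to this process."""
    try:
        contextvars.ContextVar[bool]
        subscriptable = True
    except TypeError:
        subscriptable = False
    return {
        "python": sys.version.split()[0],
        # Path of the module that was imported under the name "contextvars".
        "module_file": getattr(contextvars, "__file__", None),
        # Module that the ContextVar class says it was defined in.
        "class_module": contextvars.ContextVar.__module__,
        "subscriptable": subscriptable,
    }


if __name__ == "__main__":
    for key, value in contextvar_report().items():
        print(f"{key}: {value}")
```

Running it once in a plain Python shell and once at the top of the module that gunicorn imports, before `import spacy`, would show whether the two environments resolve `contextvars` to the same implementation.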

polm avatar Mar 09 '21 04:03 polm

Hi @polm, any update? I am facing the same issue. Is there a specific spacy and/or gevent version that works?

[Update 1] I fixed this issue by installing the lowest supported version of gevent, i.e. pip install gevent==1.4.

[Update 2] The workaround lets the program run, but it freezes for an unknown reason.

[Update 3] The program was freezing because it used regular Python threads. gevent doesn't behave well with native Python threads:
[1] https://stackoverflow.com/q/26638345/6907424
[2] https://stackoverflow.com/q/20199242/6907424

[Update 4] I finally decided to use the gthread worker, which is basically a trade-off between sync and gevent: https://dev.to/lsena/gunicorn-worker-types-how-to-choose-the-right-one-4n2c
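
For anyone who wants to try the gthread worker, here is a minimal sketch of a gunicorn config file. The file name and the `workers` and `threads` values are assumptions for illustration, not the settings actually used here; the bind address is the one from the command earlier in this thread.

```python
# gunicorn.conf.py (hypothetical file name)

bind = "0.0.0.0:5001"     # same address as the gunicorn command in this thread
worker_class = "gthread"  # threaded worker instead of gevent
workers = 2               # assumed value; tune for the machine
threads = 4               # assumed value; threads per worker process
```

It could then be passed to gunicorn with `gunicorn -c gunicorn.conf.py faiss_wsgi:app`.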

hafiz031 avatar Dec 05 '21 09:12 hafiz031

I spent some time on this but was unable to figure out what was going wrong.

While we'd be glad to accept a PR for this, since it only comes up with a certain combination of an external library and Python 3.6 I don't think it's a high priority bug for us, especially with Python 3.6 reaching EOL this month.

Thanks for the info on the workaround with gevent.

polm avatar Dec 05 '21 10:12 polm

We also just encountered the same issue and downgraded gevent to solve it. Hopefully future versions of spacy will take this into consideration, as it seems important when it comes to deployment in production.

lingvisa avatar Mar 22 '22 06:03 lingvisa

Thanks for the extra report.

It's worth noting that Python 3.6 has reached end of life at this point. Were you using 3.6 or an older version? Is upgrading your Python version not an option?

polm avatar Mar 22 '22 06:03 polm

Yes, that could also be an option. I suggested either upgrading Python to 3.7 or downgrading gevent; my colleague chose the second, and it worked. Python 3.9 may be a little too high for us, if it is required, but if the root cause is Python version compatibility, this won't be a real issue any more.

lingvisa avatar Mar 22 '22 06:03 lingvisa

As noted upthread, this issue only happens in 3.6 (or, presumably, lower versions) because of the use of a backported library. The relevant code is part of the standard Python library in 3.7+, so if you can upgrade to that or higher it should be fine.

polm avatar Mar 22 '22 06:03 polm

That's good to know and won't be a concern any more. Thanks.

lingvisa avatar Mar 22 '22 06:03 lingvisa

@lingvisa Hello, I am having the same issue. I tried it with Python 3.6.9 and Python 3.8.0, and both give the same error. May I ask which version of gevent you are using?

richielo avatar Jun 07 '22 18:06 richielo

gevent==1.4
gunicorn==20.1.0

lingvisa avatar Jun 07 '22 18:06 lingvisa