thinc
Compatibility issue with gunicorn
I am trying to import spaCy in a Flask app served by gunicorn with the gevent worker.
File "/tf/notebooks/NLP//model_spacy.py", line 1, in
I used the sample application from the gunicorn docs and imported spaCy at the top and it worked without issue. Could you give example code to reproduce your issue and clarify what version of spaCy, thinc, and Python you're using?
My code:
import spacy
def app(environ, start_response):
    """Simplest possible application object"""
    data = b'Hello, World!\n'
    status = '200 OK'
    response_headers = [
        ('Content-type', 'text/plain'),
        ('Content-Length', str(len(data)))
    ]
    start_response(status, response_headers)
    return iter([data])
Hi, I am using gunicorn with flask, my apologies for not mentioning flask earlier.
Note:
- the same code works perfectly when I run it with the Flask dev server, i.e. 'python faiss_controller.py'
- but the error occurs if I run it with gunicorn, i.e. 'gunicorn --worker-class gevent --bind 0.0.0.0:5001 faiss_wsgi:app'
Following are the versions:
- Python==3.6.9
- spacy==3.0.3
- spacy-legacy==3.0.1
- gunicorn==20.0.4
- Flask==1.1.2
- Flask-RESTful==0.3.8
- gevent==21.1.2
- thinc==8.0.1
I am using the following two .py files
- faiss_wsgi.py
- faiss_controller.py
Here is the code
faiss_wsgi.py
from faiss_controller import app
if __name__ == "__main__":
    app.run()
faiss_controller.py
import spacy
import json
from flask import Flask
from flask import request, g
from flask_restful import reqparse
import time
import logging
import logging.config
import logconfig
nlp = spacy.load("en_core_web_sm")
app = Flask(__name__)
@app.route('/api/v1/I/insert_faiss', methods=['POST'])
def insert_post():
    print("inside insert ")
    return {'status': 'success'}
if __name__ == '__main__':
    app.run(host='0.0.0.0', port='8002', debug=True)
Usage: gunicorn --worker-class gevent --bind 0.0.0.0:5001 faiss_wsgi:app
Updated error stack trace
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/arbiter.py", line 583, in spawn_worker
    worker.init_process()
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/workers/ggevent.py", line 162, in init_process
    super().init_process()
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/workers/base.py", line 119, in init_process
    self.load_wsgi()
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/workers/base.py", line 144, in load_wsgi
    self.wsgi = self.app.wsgi()
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/app/base.py", line 67, in wsgi
    self.callable = self.load()
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/app/wsgiapp.py", line 49, in load
    return self.load_wsgiapp()
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/app/wsgiapp.py", line 39, in load_wsgiapp
    return util.import_app(self.app_uri)
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/util.py", line 358, in import_app
    mod = importlib.import_module(module)
  File "/usr/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/tf/notebooks/NLP/dim_reduc/rec_sys/faiss_wsgi.py", line 5, in <module>
    from faiss_controller import app
  File "/tf/notebooks/NLP/dim_reduc/rec_sys/faiss_controller.py", line 1, in <module>
    import spacy
  File "/usr/local/lib/python3.6/dist-packages/spacy/__init__.py", line 10, in <module>
    from thinc.api import prefer_gpu, require_gpu, require_cpu # noqa: F401
  File "/usr/local/lib/python3.6/dist-packages/thinc/api.py", line 2, in <module>
    from .initializers import normal_init, uniform_init, glorot_uniform_init, zero_init
  File "/usr/local/lib/python3.6/dist-packages/thinc/initializers.py", line 4, in <module>
    from .backends import Ops
  File "/usr/local/lib/python3.6/dist-packages/thinc/backends/__init__.py", line 6, in <module>
    from .ops import Ops
  File "/usr/local/lib/python3.6/dist-packages/thinc/backends/ops.py", line 10, in <module>
    from ..util import get_array_module, is_xp_array, to_numpy
  File "/usr/local/lib/python3.6/dist-packages/thinc/util.py", line 15, in <module>
    DATA_VALIDATION: ContextVar[bool] = ContextVar("DATA_VALIDATION", default=False)
TypeError: 'type' object is not subscriptable
Thanks for the code sample. It works with Python 3.9 but I was able to reproduce the error with Python 3.6.9, so I guess it's Python version related.
We'll take a closer look at this, but you should be able to work around this by upgrading Python if that's an option.
So what's happening is that when gunicorn loads ContextVar, the type annotation is failing for some reason. I am not sure why this happens.
First, Python 3.6 is a bit different because it has to use the backported contextvars package, so the implementation is not the same as the standard library version in 3.7+. But it should still support type annotations, and in a 3.6 Python shell ContextVar[bool] doesn't throw an error.
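To make the failure mode concrete: in the Python versions discussed in this thread, a module-level annotation like the one in thinc/util.py is evaluated at import time, so the subscript ContextVar[bool] really executes. The sketch below is not thinc code. `Plain` and `is_subscriptable` are hypothetical names, used only to show that a class without subscript support raises a TypeError and to check what the ContextVar class visible in a given process supports. It assumes Python 3.7+, where contextvars is part of the standard library.

```python
from contextvars import ContextVar


class Plain:
    """Hypothetical class with no subscript support."""


def is_subscriptable(cls) -> bool:
    """Return True if `cls[bool]` can be evaluated without a TypeError."""
    try:
        cls[bool]
    except TypeError:
        return False
    return True


if __name__ == "__main__":
    # The standard library ContextVar accepts a subscript; Plain does not.
    print("ContextVar:", is_subscriptable(ContextVar))
    print("Plain:", is_subscriptable(Plain))
```

Running the same check inside the gunicorn worker, before spaCy is imported, might show whether the ContextVar class seen there still accepts a subscript.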
If it were just that one line at issue we could try removing the type annotation, but the notation is used in several places, so besides being undesirable that's not even an easy fix.
I also verified this code works if the gunicorn worker class is sync instead of gevent.
I'll keep looking at this, but that's what I've found so far.
Hi @polm, any update? I'm facing the same issue. Is there a specific spaCy and/or gevent version that works?
[Update 1] I have fixed this issue by installing the lowest supported version of gevent, i.e. pip install gevent==1.4.
[Update 2] The workaround lets the program run, but it freezes for an unknown reason.
[Update 3] The program was freezing because my program used regular Python threads. gevent doesn't behave well with native Python threads:
[1] https://stackoverflow.com/q/26638345/6907424
[2] https://stackoverflow.com/q/20199242/6907424
[Update 4] I finally decided to use the gthread worker, which is basically a trade-off between sync and gevent: https://dev.to/lsena/gunicorn-worker-types-how-to-choose-the-right-one-4n2c
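The worker switch from Update 4 could be written as a gunicorn configuration file. This is a minimal sketch, assuming the bind address from the original report. The `threads` value is an arbitrary placeholder and not a recommendation made anywhere in this thread.

```python
# gunicorn.conf.py (gunicorn config files are plain Python)

# Same address the original report passed via --bind.
bind = "0.0.0.0:5001"

# Threaded worker instead of gevent, as in Update 4.
worker_class = "gthread"

# Threads per worker; placeholder value, tune for your workload.
threads = 4
```

It would then be started with something like: gunicorn -c gunicorn.conf.py faiss_wsgi:app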
I spent some time on this but was unable to figure out what was going wrong.
While we'd be glad to accept a PR for this, since it only comes up with a certain combination of an external library and Python 3.6 I don't think it's a high priority bug for us, especially with Python 3.6 reaching EOL this month.
Thanks for the info on the workaround with gevent.
We also just encountered the same issue and downgraded gevent to solve it. Hopefully future versions of spacy will take this into consideration, as it seems important when it comes to deployment in production.
Thanks for the extra report.
It's worth noting that Python 3.6 has reached end of life at this point. Were you using 3.6 or an older version? Is upgrading your Python version not an option?
Yes, that could also be an option. I suggested either upgrading Python to 3.7 or downgrading gevent; my colleague chose the latter, and it worked. Python 3.9 may be a little too high for us if it is required, but if the root cause is Python version compatibility, this won't be a real issue any more.
As noted upthread, this issue only happens in 3.6 (or, presumably, lower versions) because of the use of a backported library. The relevant code is part of the standard Python library in 3.7+, so if you can upgrade to that or higher it should be fine.
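The version boundary described above can be sketched as a small startup check. This is an illustration under the assumption stated in this thread (the backport is only needed below 3.7), not code from spaCy or thinc, and `needs_contextvars_backport` is a hypothetical helper name.

```python
import sys


def needs_contextvars_backport(version_info=sys.version_info) -> bool:
    """Return True for Python versions older than 3.7, which lack stdlib contextvars."""
    return tuple(version_info[:2]) < (3, 7)


if __name__ == "__main__":
    if needs_contextvars_backport():
        print("Python < 3.7: contextvars comes from the backport package.")
    else:
        print("Python >= 3.7: contextvars comes from the standard library.")
```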
That's good to know and won't be a concern any more. Thanks.
@lingvisa Hello, I am having the same issue. I tried it with Python 3.6.9 and Python 3.8.0. Both give the same error. May I ask which version of gevent you are using?
gevent==1.4 gunicorn==20.1.0