
using nltk.txt for downloading stopwords

Open ngingihy opened this issue 1 year ago • 3 comments

I'm deploying my Flask app. I have 'nltk' in 'requirements.txt', and in the same root directory I have an 'nltk.txt' file that contains only 'stopwords'.

In my main.py file I also call nltk.download('stopwords'). The issue is that during deployment via CI/CD I keep getting this error:

  -----> Downloading NLTK corpora...
   -----> Downloading NLTK packages: stopwords
   /tmp/contents1601798070/deps/0/python/lib/python3.9/runpy.py:127: RuntimeWarning: 'nltk.downloader' found in sys.modules after import of package 'nltk', but prior to execution of 'nltk.downloader'; this may result in unpredictable behaviour
   warn(RuntimeWarning(msg))
   [nltk_data] Error loading stopwords: <urlopen error [Errno 104]
   [nltk_data]     Connection reset by peer>
   Error installing package. Retry? [n/y/e]
   Traceback (most recent call last):
   File "/tmp/contents1601798070/deps/0/python/lib/python3.9/runpy.py", line 197, in _run_module_as_main
   return _run_code(code, main_globals, None,
   File "/tmp/contents1601798070/deps/0/python/lib/python3.9/runpy.py", line 87, in _run_code
   exec(code, run_globals)
   File "/tmp/contents1601798070/deps/0/python/lib/python3.9/site-packages/nltk/downloader.py", line 2544, in <module>
   rv = downloader.download(
   File "/tmp/contents1601798070/deps/0/python/lib/python3.9/site-packages/nltk/downloader.py", line 788, in download
   choice = input().strip()
   EOFError: EOF when reading a line
   **ERROR** Could not download NLTK Corpora: exit status 1
BuildpackCompileFailed - App staging failed in the buildpack compile phase

I tried changing the line endings (EOL) to Linux style using Notepad++, but I still get exactly the same error.

ngingihy avatar Apr 19 '23 01:04 ngingihy

NLTK doesn't download the stopwords from an nltk.txt file. To the best of my knowledge, such a file does nothing. However, I can't tell why you're getting a "Connection reset by peer" error when trying to run nltk.download("stopwords"). Perhaps you can run the download from the terminal instead, i.e. python -m nltk.downloader stopwords
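
If the "Connection reset by peer" error is transient, one option is to guard the download in main.py and retry it a few times. The following is only a minimal sketch under that assumption: download_with_retry and ensure_stopwords are hypothetical helper names, not part of NLTK, and the attempt count and delay are arbitrary. The only NLTK calls used are nltk.data.find and nltk.download.

```python
import time


def download_with_retry(download, package, attempts=3, delay=2.0):
    """Call download(package) until it returns a truthy value or attempts run out.

    `download` is any callable; it is passed in so this helper does not
    depend on NLTK being importable.
    """
    for attempt in range(1, attempts + 1):
        try:
            if download(package):
                return True
            print(f"attempt {attempt}: download of {package!r} reported failure")
        except OSError as exc:
            # Defensive: network errors such as urllib's URLError are OSError subclasses.
            print(f"attempt {attempt}: {exc}")
        if attempt < attempts:
            time.sleep(delay)
    return False


def ensure_stopwords():
    """Download the stopwords corpus only if it is not already on the NLTK data path."""
    import nltk  # imported lazily so the helper above works without NLTK installed

    try:
        nltk.data.find("corpora/stopwords")
        return True
    except LookupError:
        # Corpus not found locally, so try to fetch it.
        return download_with_retry(
            lambda pkg: nltk.download(pkg, quiet=True), "stopwords"
        )
```

Calling ensure_stopwords() once at startup, in place of a bare nltk.download('stopwords'), would skip the network entirely when the corpus is already present. It does not change what the build step does with nltk.txt.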

tomaarsen avatar Apr 19 '23 05:04 tomaarsen

I'm installing the libraries using pip install -r requirements.txt through automated jobs.

ngingihy avatar Apr 26 '23 13:04 ngingihy

Other users have had similar issues recently. Perhaps some of the solutions there apply to your situation as well: #3146.

tomaarsen avatar Apr 26 '23 13:04 tomaarsen