Error running example.py
On: 31/08/2017 -2.04
CARD PAYMENT TO SHELL TOTHILL,2.04 GBP, RATE 1.00/GBP ON 29-08-2013
My guess is:
> 6
Traceback (most recent call last):
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/textblob/decorators.py", line 35, in decorated
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/textblob/tokenizers.py", line 59, in tokenize
    return nltk.tokenize.sent_tokenize(text)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/nltk/tokenize/__init__.py", line 106, in sent_tokenize
    tokenizer = load(f"tokenizers/punkt/{language}.pickle")
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/nltk/data.py", line 750, in load
    opened_resource = _open(resource_url)
                      ^^^^^^^^^^^^^^^^^^^
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/nltk/data.py", line 876, in _open
    return find(path_, path + [""]).open()
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/nltk/data.py", line 583, in find
    raise LookupError(resource_not_found)
LookupError:
**********************************************************************
  Resource punkt not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('punkt')

  For more information see: https://www.nltk.org/data.html

  Attempted to load tokenizers/punkt/PY3/english.pickle

  Searched in:
    - '/Users/arturo/nltk_data'
    - '/Users/arturo/Documents/GitHub/BankClassify/.venv/nltk_data'
    - '/Users/arturo/Documents/GitHub/BankClassify/.venv/share/nltk_data'
    - '/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
    - ''
**********************************************************************
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/Users/arturo/Documents/GitHub/BankClassify/example.py", line 5, in <module>
    bc.add_data("Statement_Example.txt")
  File "/Users/arturo/Documents/GitHub/BankClassify/BankClassify.py", line 58, in add_data
    self._ask_with_guess(self.new_data)
  File "/Users/arturo/Documents/GitHub/BankClassify/BankClassify.py", line 154, in _ask_with_guess
    self.classifier.update([(stripped_text, category) ])
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/textblob/classifiers.py", line 292, in update
    self._word_set.update(_get_words_from_dataset(new_data))
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/textblob/classifiers.py", line 64, in _get_words_from_dataset
    return set(all_words)
           ^^^^^^^^^^^^^^
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/textblob/classifiers.py", line 63, in <genexpr>
    all_words = chain.from_iterable(tokenize(words) for words, _ in dataset)
                                    ^^^^^^^^^^^^^^^
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/textblob/classifiers.py", line 59, in tokenize
    return word_tokenize(words, include_punc=False)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/textblob/tokenizers.py", line 76, in word_tokenize
    for sentence in sent_tokenize(text)
                    ^^^^^^^^^^^^^^^^^^^
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/textblob/base.py", line 67, in itokenize
    return (t for t in self.tokenize(text, *args, **kwargs))
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/arturo/Documents/GitHub/BankClassify/.venv/lib/python3.12/site-packages/textblob/decorators.py", line 37, in decorated
    raise MissingCorpusError() from error
textblob.exceptions.MissingCorpusError:
Looks like you are missing some required data for this feature.

To download the necessary data, simply run

    python -m textblob.download_corpora

or use the NLTK downloader to download the missing data: http://nltk.org/data.html
If this doesn't fix the problem, file an issue at https://github.com/sloria/TextBlob/issues.
Process finished with exit code 1
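The error message itself points at the fix: download the Punkt tokenizer data with the NLTK downloader, using the same interpreter that runs example.py (here, the one inside .venv). A minimal sketch:

import nltk

# Fetch the Punkt sentence-tokenizer models into the default
# nltk_data directory (~/nltk_data), the first entry in the
# search list shown in the traceback above.
nltk.download('punkt')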
I ran python -m textblob.download_corpora but still received the above error
I have the same error
Same error. Wondering if there's something I could be missing; I'm running the example from a virtual environment, though.
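One thing worth ruling out in a virtual environment is a data-path mismatch: the download may have landed in a directory the venv's interpreter never searches. A quick check, where the download_dir below is just a placeholder for whichever entry your own search list prints:

import nltk

# List the directories this interpreter searches for NLTK data;
# compare them against wherever download_corpora put the files.
print(nltk.data.path)

# Download Punkt directly into one of the searched directories.
# The path here is an example placeholder, not a real location.
nltk.download('punkt', download_dir='/path/to/.venv/nltk_data')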
Noticed it's been a while since the last change to the repo (about six months before the time of writing; see https://github.com/sloria/TextBlob/commit/c27324d9986fdfa56d4337c3bce952f2b057ceb4), so I'd roughly say this project isn't really maintained anymore, or the author(s) and contributor(s) haven't had the time to address some issues of late.
Still hope to hear from them whenever someone's available.
I have the same error too:
from textblob import TextBlob

sample_text = "I love data science and machine learning. I love coding. I love data science and coding."
TextBlob(sample_text).ngrams(3)  # 3-gram
---------------------------------------------------------------------------
LookupError                               Traceback (most recent call last)
File c:\Users\tokel\anaconda3\Lib\site-packages\textblob\decorators.py:35, in requires_nltk_corpus.<locals>.decorated(*args, **kwargs)
     34 try:
---> 35     return func(*args, **kwargs)
     36 except LookupError as error:

File c:\Users\tokel\anaconda3\Lib\site-packages\textblob\tokenizers.py:59, in SentenceTokenizer.tokenize(self, text)
     58 """Return a list of sentences."""
---> 59 return nltk.tokenize.sent_tokenize(text)

File c:\Users\tokel\anaconda3\Lib\site-packages\nltk\tokenize\__init__.py:119, in sent_tokenize(text, language)
    110 """
    111 Return a sentence-tokenized copy of *text*,
    112 using NLTK's recommended sentence tokenizer
   (...)
    117 :param language: the model name in the Punkt corpus
    118 """
--> 119 tokenizer = _get_punkt_tokenizer(language)
    120 return tokenizer.tokenize(text)

File c:\Users\tokel\anaconda3\Lib\site-packages\nltk\tokenize\__init__.py:105, in _get_punkt_tokenizer(language)
     98 """
     99 A constructor for the PunktTokenizer that utilizes
    100 a lru cache for performance.
...
python -m textblob.download_corpora
or use the NLTK downloader to download the missing data: http://nltk.org/data.html
If this doesn't fix the problem, file an issue at https://github.com/sloria/TextBlob/issues.
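The truncated output doesn't show which resource was requested, but this traceback goes through _get_punkt_tokenizer, which only exists in recent NLTK releases, and those load the Punkt models from punkt_tab rather than the old punkt pickle. If that's what's happening here, a likely workaround is:

import nltk

# Recent NLTK versions look up tokenizers/punkt_tab instead of
# tokenizers/punkt, so download that resource as well.
nltk.download('punkt_tab')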
Please upgrade to TextBlob 0.19.0 and rerun the download_corpora script.
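For anyone landing here, that would look something like the following (the version pin just reflects the release mentioned above):

pip install --upgrade "textblob>=0.19.0"
python -m textblob.download_corpora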