
TweetTokenizer in constructor raises ValueError

Open dparker2 opened this issue 6 years ago • 2 comments

Despite the documentation here stating:

You can use other tokenizers, such as those provided by NLTK, by passing them into the TextBlob constructor then accessing the tokens property.

This fails:

from textblob import TextBlob
from nltk.tokenize import TweetTokenizer

blob = TextBlob("I don't work!", tokenizer=TweetTokenizer())  # Raises ValueError

However this works fine:

blob = TextBlob("I do work!")
blob.tokenize(TweetTokenizer())
# ==> ["I", "do", "work", "!"]

Demo of issue: https://repl.it/repls/SplendidVirtualModel

dparker2, Oct 12 '19 06:10

TextBlob only supports NLTK tokenizers from NLTK's tokenizer API group (those built on the `nltk.tokenize.api` interface), which is why you are getting the error. As a workaround for your case, you can keep using the NLTK tokenizer directly for tweets.
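
To illustrate why the constructor and `tokenize()` can behave differently, here is a minimal, dependency-free sketch. It assumes the constructor validates the tokenizer with an `isinstance` check against a base class, while `tokenize()` only needs an object with a `tokenize` method. Every class and function name below (`TokenizerInterface`, `DuckTokenizer`, `Blob`, `TokenizerAdapter`, `constructor_accepts`) is a hypothetical stand-in, not TextBlob's or NLTK's actual code.

```python
from abc import ABC, abstractmethod


class TokenizerInterface(ABC):
    """Hypothetical stand-in for the base class the constructor checks against."""

    @abstractmethod
    def tokenize(self, text):
        ...


class DuckTokenizer:
    """Hypothetical tokenizer: has tokenize() but does not inherit the interface."""

    def tokenize(self, text):
        return text.split()


class Blob:
    """Hypothetical stand-in for a blob class (an assumption, not TextBlob itself)."""

    def __init__(self, text, tokenizer=None):
        # Strict path: reject anything that is not an instance of the interface.
        if tokenizer is not None and not isinstance(tokenizer, TokenizerInterface):
            raise ValueError("tokenizer must be an instance of TokenizerInterface")
        self.text = text
        self.tokenizer = tokenizer

    def tokenize(self, tokenizer):
        # Lenient path: duck typing, only a tokenize() method is required.
        return tokenizer.tokenize(self.text)


class TokenizerAdapter(TokenizerInterface):
    """Wraps any object with tokenize() so it passes the isinstance check."""

    def __init__(self, inner):
        self._inner = inner

    def tokenize(self, text):
        return self._inner.tokenize(text)


def constructor_accepts(tokenizer):
    """Return True if Blob's constructor accepts the tokenizer, else False."""
    try:
        Blob("I do work", tokenizer=tokenizer)
        return True
    except ValueError:
        return False
```

If the real check works this way, another possible workaround is a thin adapter that subclasses the base class TextBlob expects and delegates to `TweetTokenizer`. That is an untested assumption here, so verify it against your installed versions.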

paridhimnnit, Feb 19 '20 14:02

PR #325 should resolve it.

jschnurr, May 31 '20 17:05