TextBlob
TweetTokenizer in constructor raises ValueError
Despite the documentation here stating:
You can use other tokenizers, such as those provided by NLTK, by passing them into the TextBlob constructor then accessing the tokens property.
This fails:
from textblob import TextBlob
from nltk.tokenize import TweetTokenizer
blob = TextBlob("I don't work!", tokenizer=TweetTokenizer()) # Raises ValueError
However, this works fine:
blob = TextBlob("I do work!")
blob.tokenize(TweetTokenizer())
# ==> ["I", "do", "work", "!"]
Demo of issue: https://repl.it/repls/SplendidVirtualModel
In TextBlob, only NLTK tokenizers from the API group are supported, which is why you are getting the error. As a workaround for your case, you can keep using the NLTK tokenizer for tweets directly.
PR #325 should resolve it.