TextBlob ngrams removes some symbols

Open PasaOpasen opened this issue 4 years ago • 1 comments

from textblob import TextBlob

TextBlob('c# c++ r').ngrams(2)
# [WordList(['c', 'c']), WordList(['c', 'r'])]

PasaOpasen, Jun 27 '20 19:06

That is because ngrams() builds its WordList objects from the blob's words, and the tokenization behind words drops punctuation. In the source we can see a parameter at this level called 'include_punc', which is set to False by default. Maybe this parameter should be exposed at the ngrams() level so the symbols can be kept.
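
Until something like that exists, one possible stopgap is to build the n-grams from whitespace-separated tokens so that '#' and '+' survive. This is only a minimal sketch and not TextBlob's own method: `whitespace_ngrams` is a hypothetical helper name, it assumes tokens are separated by whitespace, and it returns plain lists rather than WordList objects.

```python
def whitespace_ngrams(text, n=2):
    """Return n-grams over whitespace-separated tokens, keeping symbols.

    Hypothetical helper, not part of TextBlob.
    """
    if n <= 0:
        return []
    # str.split() with no arguments splits on any run of whitespace,
    # so 'c#' and 'c++' stay intact as single tokens.
    tokens = text.split()
    return [tokens[i:i + n] for i in range(len(tokens) - n + 1)]


print(whitespace_ngrams('c# c++ r', 2))
# [['c#', 'c++'], ['c++', 'r']]
```

The trade-off is that ordinary punctuation stays attached too, so 'hello,' and 'hello' would count as different tokens.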

leo-p-labs, Sep 09 '21 12:09