scattertext
ngram features extractor using spacy
I needed to create scattertext plots with n-grams of various lengths, hence this PR.
Hi Łukasz,
Thanks so much for the PR. It would be great to handle more than bigrams.
A few requests before I can merge this:
- Is it possible to eliminate the cytoolz dependency without incurring a substantial performance hit? I'm trying to keep the number of dependencies minimal.
- Could you please add some inline documentation explaining the parameters of the various functions, plus a small doctest-style example showing how the feature extractor is used?
- Could you please add some unit tests in the test/ directory? test_FeatsFromSpacyDoc.py could be used as a partial model.
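
To illustrate the first two requests, here is a rough sketch of a sliding-window n-gram generator that uses only the standard library and carries a doctest-style example. This is not the PR's code: the function name `ngrams` and its parameters are hypothetical, and it assumes cytoolz is needed mainly for something like `sliding_window`.

```python
from itertools import islice, tee


def ngrams(tokens, n):
    """Return an iterator over successive n-grams (tuples) of `tokens`.

    `tokens` is any iterable of items; `n` is the n-gram length.
    (Hypothetical helper for illustration, not part of scattertext.)

    >>> list(ngrams(['a', 'b', 'c', 'd'], 2))
    [('a', 'b'), ('b', 'c'), ('c', 'd')]
    >>> list(ngrams(['a', 'b'], 3))
    []
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Make n independent iterators over the same tokens.
    iterators = tee(tokens, n)
    # Advance the k-th iterator by k positions so zip lines up each window.
    shifted = (islice(it, offset, None) for offset, it in enumerate(iterators))
    # zip stops at the shortest iterator, so incomplete windows are dropped.
    return zip(*shifted)
```

With spaCy, `tokens` could be, for example, `[tok.lemma_ for tok in doc]`; whether this matches cytoolz's speed would need to be measured on real documents.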
Appreciate your contribution!
Jason