
N-gram feature extractor using spaCy

Open: laugustyniak opened this issue 6 years ago • 1 comment

I needed to create Scattertext plots with n-grams of various lengths, so here is a PR.

laugustyniak, Jan 25 '19 16:01

Hi Łukasz,

Thanks so much for the PR. It would be great to handle more than bigrams.

A few requests before I can merge this:

  • Is it possible to eliminate the cytoolz dependency without incurring a substantial performance hit? I'm trying to keep the number of dependencies minimal.
  • Could you please add inline documentation explaining the parameters of each function, along with a small doctest-style example showing how the feature extractor is used?
  • Could you please add some unit tests in the test/ directory? test_FeatsFromSpacyDoc.py could be used as a partial model.
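
For illustration, the first two requests could be met along these lines. This is a minimal sketch, not the code from the PR: the function names `sliding_ngrams` and `ngram_counts` and their parameters are hypothetical, and it assumes the tokens have already been extracted into a plain list of strings. It builds the sliding window from the standard library (`zip` plus `itertools.islice`) in place of cytoolz, and the docstrings carry doctest-style examples.

```python
from collections import Counter
from itertools import islice


def sliding_ngrams(tokens, n):
    """Return an iterator over every run of ``n`` consecutive tokens.

    Hypothetical helper: a standard-library stand-in for a cytoolz
    sliding window.

    >>> list(sliding_ngrams(["a", "b", "c"], 2))
    [('a', 'b'), ('b', 'c')]
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    tokens = list(tokens)
    # Zip n copies of the sequence, the i-th copy shifted by i positions.
    # zip stops at the shortest copy, so only full-length windows appear.
    return zip(*(islice(tokens, i, None) for i in range(n)))


def ngram_counts(tokens, min_n=1, max_n=3, sep=" "):
    """Count all n-grams with ``min_n <= n <= max_n``.

    Parameters
    ----------
    tokens : iterable of str
        Token strings in document order.
    min_n, max_n : int
        Smallest and largest n-gram length to extract (inclusive).
    sep : str
        String used to join the tokens of each n-gram into one feature name.

    >>> counts = ngram_counts(["the", "quick", "fox"], min_n=1, max_n=2)
    >>> counts["the quick"]
    1
    """
    tokens = list(tokens)
    counts = Counter()
    for n in range(min_n, max_n + 1):
        counts.update(sep.join(gram) for gram in sliding_ngrams(tokens, n))
    return counts
```

With spaCy, the token list might be something like `[tok.lower_ for tok in doc]`. How the resulting counts would plug into Scattertext's feature-extractor classes is left open here, since that depends on the PR's actual interface.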

Appreciate your contribution!

Jason

JasonKessler, Jan 28 '19 08:01