Adriane Boyd
And I would switch this to `develop` but will wait since it's going to make the history look weird at the moment.
Thanks for the report, it's always interesting to see examples like this, even though in many cases there isn't any immediate action that we can take (see #3052). Some thoughts:...
In terms of usability, I think one option would be to add a `before_creation` callback for setting stop words from a JSON list.
I looked at all the examples in the docs for setting custom stop words and I think they should all continue to be fine. Nothing prevents you from providing custom...
The `before_creation` callback is kind of problematic because you don't want to reference any external data. But we can think about what to do...
No, because the stop words are part of the defaults that aren't serialized with the pipelines. (Because it's effectively an `is_stop` method rather than a plain list/set.) I'll have to...
Well, this reminded me why we're using `Pipe` here. It looks like mypy may support `hasattr` soon, so I think once that is supported, it would make sense to switch...
@robookwus: Welcome! This particular thread is for a pull request that's not related to the docs. Instead, please open a new issue (https://github.com/explosion/spaCy/issues/new?assignees=&labels=&template=02_docs.md) with more details about what you're interested...
There are a number of ways to approach this, so it's useful to hear from everyone who is interested in this! Can I ask some questions about what kind of...
If you want to treat proper nouns as nouns and you don't care whether you preserve the original POS tags in the doc, you can modify the POS before lemmatization...
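As a rough illustration of that idea, here is a minimal sketch of a custom component that rewrites `PROPN` to `NOUN`. The component name `propn_to_noun` is made up for this example, and the `Doc` is built by hand with preset POS tags so that no trained pipeline is needed. It overwrites the original tags, so it only fits the case where you don't need to keep them.

```python
import spacy
from spacy.language import Language
from spacy.tokens import Doc


# Hypothetical component name, chosen for this sketch only.
@Language.component("propn_to_noun")
def propn_to_noun(doc: Doc) -> Doc:
    # Overwrite the coarse-grained POS tag in place; the original
    # PROPN tag is lost, so only do this if you don't need it later.
    for token in doc:
        if token.pos_ == "PROPN":
            token.pos_ = "NOUN"
    return doc


# A blank pipeline has no tagger, so the POS tags are set by hand here
# to stand in for what a trained tagger would have predicted.
nlp = spacy.blank("en")
doc = Doc(nlp.vocab, words=["Paris", "is", "big"], pos=["PROPN", "AUX", "ADJ"])
doc = propn_to_noun(doc)
pos_tags = [token.pos_ for token in doc]

# In a trained pipeline, the component would be placed ahead of the
# lemmatizer, assuming the pipeline has a component with that name:
#     nlp.add_pipe("propn_to_noun", before="lemmatizer")
```

The placement matters: the component has to run after whatever assigns the POS tags and before the lemmatizer, so that the lemmatizer sees the modified tags.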