Adriane Boyd
Thanks, the example is very helpful! We will look into it...
Just a note that editing this setting in `config.cfg` for a trained pipeline won't change anything because these settings are only used on initialization. It will work if you're training...
Sure, a PR would be welcome! The functions would go in this section: https://spacy.io/api/top-level#gold. The source is in `website/docs/api/top-level.md`. Don't be concerned if you can't get the website dev mode...
Sorry this didn't work as expected, and thanks for the suggestion! This issue is kind of low priority on our end right now, but we'll try to come back to...
Hi, it does look like there might be a rule for `e -> er` that's missing from the French lemmatizer rules: https://github.com/explosion/spacy-lookups-data/blob/544a965501f06f55349e7402e80d6a49bc4cb3cd/spacy_lookups_data/data/fr_lemma_rules.json#L79-L125 My French is not that great, so I'm...
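For context, the lemma rules in that file map a word-final suffix to a replacement. As a rough illustration of how such rules are applied, here is a minimal plain-Python stand-in (not spaCy's actual implementation, and the rule list below is invented for the example, not the real French rule set):

```python
# Minimal stand-in for suffix-based lemma rules, in the spirit of the
# fr_lemma_rules.json entries linked above. These rules are illustrative.
VERB_RULES = [
    ("és", "er"),  # e.g. "mangés" -> "manger"
    ("ée", "er"),
    ("e", "er"),   # the kind of rule the report suggests is missing
]

def apply_rules(form: str, rules) -> list:
    """Return every candidate lemma produced by a matching suffix rule."""
    candidates = []
    for old, new in rules:
        if form.endswith(old):
            candidates.append(form[: len(form) - len(old)] + new)
    return candidates

print(apply_rules("mange", VERB_RULES))   # -> ['manger']
print(apply_rules("mangés", VERB_RULES))  # -> ['manger']
```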
There is a lemmatizer cache that would cause this behavior. You can clear it by hand (`nlp.get_pipe("lemmatizer").cache = {}`) or save and reload the pipeline.
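To see why stale entries can persist after you edit the rules, here is a toy stand-in for a lemmatizer with a form-keyed cache (illustrative plain Python, not spaCy's internals):

```python
class CachingLemmatizer:
    """Toy lemmatizer with a cache, mimicking the behavior described above.
    The class and rule data are illustrative, not spaCy's implementation."""

    def __init__(self, rules):
        self.rules = dict(rules)
        self.cache = {}

    def lemmatize(self, form: str) -> str:
        if form in self.cache:           # stale entries survive rule edits
            return self.cache[form]
        lemma = self.rules.get(form, form)
        self.cache[form] = lemma
        return lemma

lemmatizer = CachingLemmatizer({"mange": "mangé"})  # deliberately wrong rule
lemmatizer.lemmatize("mange")         # caches the wrong lemma
lemmatizer.rules["mange"] = "manger"  # fixing the rule alone is not enough...
assert lemmatizer.lemmatize("mange") == "mangé"   # ...the cached value wins
lemmatizer.cache = {}                 # clearing the cache, as suggested above
assert lemmatizer.lemmatize("mange") == "manger"  # now the fix takes effect
```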
Sure, if you'd like to open a PR, please go ahead! We mainly test the lookup lemmatizers in that repo because we don't want to have to...
The rule-based lemmatizer does have a mechanism for checking for forms like infinitives that are already lemmas and don't need to be processed further. There's not currently a check for...
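As a sketch of that short-circuit mechanism, here is a toy version (plain Python with invented data, not spaCy's actual lemmatizer): forms already in the lemma index are returned as-is before any rules run, and rule output is only accepted if it lands on a known lemma.

```python
# Sketch of the "already a lemma" short-circuit described above.
# KNOWN_LEMMAS and RULES are illustrative stand-ins, not spaCy data.
KNOWN_LEMMAS = {"manger", "parler"}
RULES = [("e", "er")]

def lemmatize(form: str) -> str:
    if form in KNOWN_LEMMAS:  # e.g. an infinitive: already a lemma, stop here
        return form
    for old, new in RULES:
        if form.endswith(old):
            candidate = form[: len(form) - len(old)] + new
            if candidate in KNOWN_LEMMAS:  # only accept known lemmas
                return candidate
    return form  # fall back to the surface form

print(lemmatize("manger"))  # short-circuits -> manger
print(lemmatize("mange"))   # rule applies   -> manger
```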
That's a good point! We took the lemma exceptions out of the tokenizer (so the tokenizer is only dealing with tokenization) without moving them to a new component. We can...
Yes, the `attribute_ruler` is the right place to add these exceptions in v3. We will need to add these exceptions to the `attribute_ruler` when we configure the pretrained pipelines for...
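In spaCy v3 the component's API for this is `AttributeRuler.add(patterns, attrs)`, where patterns match tokens and attrs override their attributes. As a runnable illustration of the idea without spaCy installed, here is a toy stand-in (the exception data is invented for the example):

```python
# Toy stand-in for the attribute_ruler exception mechanism described above:
# a pattern matches a token's text, and attrs override its attributes.
# The real component is spaCy's AttributeRuler; this data is illustrative.
exceptions = []

def add_exception(orth: str, attrs: dict) -> None:
    """Register an override for tokens whose text equals `orth`."""
    exceptions.append((orth, attrs))

def apply_exceptions(tokens: list) -> list:
    """Apply every matching exception to each token dict in place."""
    for token in tokens:
        for orth, attrs in exceptions:
            if token["orth"] == orth:
                token.update(attrs)  # e.g. force a specific lemma
    return tokens

add_exception("ain't", {"lemma": "be"})
doc = apply_exceptions([{"orth": "ain't", "lemma": "ain't"}])
print(doc[0]["lemma"])  # -> be
```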