Paul O'Leary McCann

233 comments by Paul O'Leary McCann

@xavierfontaine It looks like this may be an issue with the tokenizer config. In the directory where the pipeline is saved locally, at the top, there should be a `config.cfg`...
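If it helps, here is a rough sketch of one way to look at the tokenizer settings in that file without loading the whole pipeline. It assumes the saved `config.cfg` has an `[nlp.tokenizer]` block, and it uses Python's standard `configparser` rather than spaCy's own config loader, so values come back as raw strings (quotes included). The function name `read_tokenizer_settings` and the `pipeline_dir` argument are made up for illustration.

```python
import configparser
from pathlib import Path


def read_tokenizer_settings(pipeline_dir):
    """Return the [nlp.tokenizer] block of a saved pipeline's config.cfg as a dict.

    Hypothetical helper: this is not part of spaCy, just a quick inspection aid.
    """
    config_path = Path(pipeline_dir) / "config.cfg"
    # Disable interpolation so values such as ${paths.train} are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    with config_path.open(encoding="utf8") as f:
        parser.read_file(f)
    # An empty dict means the block is absent from this config.
    if not parser.has_section("nlp.tokenizer"):
        return {}
    return dict(parser["nlp.tokenizer"])
```

Calling `read_tokenizer_settings("path/to/pipeline")` on the saved directory should show which tokenizer is registered under `@tokenizers`, which can then be compared against what you expect.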

Thanks for the report, that does seem to not be working. Note that the backends of the public demos haven't been updated in a while, so they're still running the...

Checking this again, it still seems to be an issue. It looks like LENGTH matches specifically cause some kind of server-side error, usually but not always.

Checking this again, I am able to reproduce this. The way it works is a little subtle, though I think you mentioned it above. 1. It's fine if you click...

Minimal reproduction code for this issue:

```
import spacy
from spacy.language import Language

@Language.component("special_split")
def special_split(doc):
    # Note this code is not robust with handling indices
    for ii, tok in...
```

Using the 3.0 model with 3.1, the wrong sentence splits still show up, so it looks like the "fix" in 3.1 is a happy accident. Also note these tests are...

I think just documenting this is fine, since if someone is using arguments like this they're digging pretty deep into the system and can accept a wrinkle. As another option,...

Thanks for the suggestion! I think we'd be happy to take a PR for that.

Thanks for the report, we'll take a look at that! To be clear, this is a model that you trained in 3.1.2 initially, and then resumed training on? Or did...