Nicolas Patry

Results 978 comments of Nicolas Patry

I didn't, but we're still missing some tests. I offered to help write them if you want, but we really want tests to showcase this feature (and make...

No, I pinged him because the code was clean enough to be looked at, but the tests (especially for such a complex feature) are mandatory. If you can get started...

@sgugger At least reformer and camembert are concerned (some tests failed when I wrote bogus code here).

It's breaking reformer and xlnet; I removed the breaking part of it for Llama.

Hi @SergeyShk. This is linked to how tokenizers work, and there's nothing to be done about it (it makes no difference to the tokenizer whether there was a space or not,...

Legacy. This was created before we used the `tokenizers` library, so `offsets` were not even an option and indexing back was not possible. Since we're keen to never break compatibility (until...

> word is also always in lower case

This depends on the tokenizer.

> cc @Narsil if I am missing something (maybe the normalizers in rust should support identity type)

`normalizer: None` should do nothing. Most likely a case not handled by our...

The current modification LGTM. I'm not sure why the tests fail; maybe rebase?

Hi @timxieICN, thanks for the suggestion. In general, sentence-similarity models like https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2 are served by `SentenceTransformers`, which is a library built on top of `transformers` itself: https://huggingface.co/sentence-transformers Sentence transformers adds a...