Anthony MOI
Thank you for this PR @alexeyr! I'll do my best to have a look in the near future!
Awesome! There's no rush so take the time you need! I'll add a few pieces of information that may help with the decision:
- Every `Model` is different, and so...
Indeed, you need to specify the path during the build phase using `--baseHref="/mongo"`
I think the reason was that `add_special_tokens` delegates to `add_tokens` for actually adding the tokens to the relevant maps/structures, and we need the special tokens to be added there. But...
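The delegation described above can be sketched roughly like this. This is an illustrative Rust snippet, not the actual `tokenizers` source: the struct and method names (`Vocab`, `add_tokens`, `add_special_tokens`) mirror the ones mentioned in the comment, but the internals are a simplified assumption.

```rust
use std::collections::HashMap;

// Illustrative only: special tokens go through the same `add_tokens`
// path so they land in the shared token-to-id map, and are additionally
// recorded in a separate list of special tokens.
#[derive(Default)]
struct Vocab {
    token_to_id: HashMap<String, u32>,
    special: Vec<String>,
}

impl Vocab {
    // Adds tokens to the vocabulary map; returns how many were new.
    fn add_tokens(&mut self, tokens: &[&str]) -> usize {
        let mut added = 0;
        for &tok in tokens {
            if !self.token_to_id.contains_key(tok) {
                let id = self.token_to_id.len() as u32;
                self.token_to_id.insert(tok.to_string(), id);
                added += 1;
            }
        }
        added
    }

    // Delegates to `add_tokens` for the actual insertion, then marks
    // the tokens as special.
    fn add_special_tokens(&mut self, tokens: &[&str]) -> usize {
        let added = self.add_tokens(tokens);
        for &tok in tokens {
            if !self.special.iter().any(|s| s == tok) {
                self.special.push(tok.to_string());
            }
        }
        added
    }
}

fn main() {
    let mut v = Vocab::default();
    v.add_tokens(&["hello"]);
    // "hello" already exists, so only "[CLS]" is newly added.
    let added = v.add_special_tokens(&["[CLS]", "hello"]);
    assert_eq!(added, 1);
    assert_eq!(v.token_to_id.len(), 2);
    assert_eq!(v.special.len(), 2);
    println!("ok");
}
```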
Thank you for reporting this! See https://github.com/huggingface/tokenizers/issues/570 for the explanation about the differences. We should definitely remove the `end_of_word_suffix` option from the `WordPieceTrainer` as it makes absolutely no sense to...
Any specific reason for closing the issue? Did you manage to do what you wanted?
There's no easy way for now. This will be possible as soon as we have https://github.com/huggingface/tokenizers/issues/15
There is no easy way at the moment. For tokenizers that use a BPE, you can probably do it manually in some cases, but you will need to dig into...
Indeed, we do not integrate with any downstream solution at the moment and leave that up to you, as your use case might be completely different from others'. Do you have any...
Thank you! As I expected, this method works for any `T`, so the fact that `u32` has to be converted applies to a lot of different types. We...
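The point about a generic method forcing conversions on concrete types like `u32` can be illustrated with a minimal Rust sketch. The function name and trait bound here are hypothetical, chosen only to show the pattern: once a method is generic over `T`, every concrete integer type (including `u32`) goes through the same conversion path.

```rust
// Illustrative only: a generic function over any T convertible to u64.
// Because the bound is generic, u32 is converted exactly like u16 or
// u8 would be; no special case is needed for it.
fn sum_ids<T: Into<u64> + Copy>(ids: &[T]) -> u64 {
    ids.iter().map(|&id| id.into()).sum()
}

fn main() {
    let ids_u32: Vec<u32> = vec![1, 2, 3];
    let ids_u16: Vec<u16> = vec![4, 5];
    assert_eq!(sum_ids(&ids_u32), 6);
    assert_eq!(sum_ids(&ids_u16), 9);
    println!("ok");
}
```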