Ita Zaporozhets
Awesome, I'm glad it worked! Thanks for your patience 🤗
@DuyguA Sorry for the delay! Merged!! 🚀 Thanks for working on this 🤗
@ArthurZucker deprecate it, or set it to False by default (currently it defaults to True)? If we keep it settable, then we don't deprecate?
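For concreteness, a minimal sketch of the two options being weighed; the class and the flag handling are hypothetical, not the actual `transformers` code:

```python
import warnings

class ToyTokenizer:
    """Hypothetical tokenizer illustrating the two options for the flag."""

    def __init__(self, add_prefix_space=None):
        if add_prefix_space is not None:
            # Option 1: keep the kwarg settable, but warn that it is deprecated.
            warnings.warn(
                "`add_prefix_space` is deprecated and will be removed in a future release.",
                FutureWarning,
            )
        else:
            # Option 2: just flip the default from True to False, no deprecation.
            add_prefix_space = False
        self.add_prefix_space = add_prefix_space
```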
Hi @terryyz , can you please also share the outputs you are seeing for those prompts, to show the issue? Thanks!
@terryyz Sorry, I understood that `add_prefix_space=False` and `legacy=False` worked in `transformers` to fix the degradation? Can you please clarify what you meant / the current issue I can look into? Thank...
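For reference, this is roughly how those two flags are passed at load time (a sketch; the checkpoint name is a placeholder, and the kwargs are the ones accepted by the Llama-style tokenizers):

```python
from transformers import AutoTokenizer

# Placeholder checkpoint; substitute the model under discussion.
tok = AutoTokenizer.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    add_prefix_space=False,  # do not prepend a space before the first token
    legacy=False,            # use the fixed (non-legacy) sentencepiece behavior
)
print(tok.tokenize("Hello world"))
```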
Sorry I find it a bit hard to follow, which outputs are you looking for the EOS in?
The additional 'space' after the comma in the test `june 22 , 2018` _likely should_ be expected (i.e., not stripped), since RAG uses BART. `test_tokenization_rag` is minimal so it isn't...
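A quick way to see this with BART directly (a sketch assuming `facebook/bart-base`; the cleanup flag is what collapses the space before punctuation):

```python
from transformers import BartTokenizer

tok = BartTokenizer.from_pretrained("facebook/bart-base")
ids = tok("june 22 , 2018")["input_ids"]

# Without cleanup, the extra space before the comma survives the round trip.
print(tok.decode(ids, skip_special_tokens=True, clean_up_tokenization_spaces=False))
# -> "june 22 , 2018"

# With cleanup, the space before the comma is stripped.
print(tok.decode(ids, skip_special_tokens=True, clean_up_tokenization_spaces=True))
# -> "june 22, 2018"
```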
Awesome! 💯
@ArthurZucker SigLIP will be the first model we can test this with (the first model where we infer `PreTrainedTokenizerFast` without it being specified in the `tokenizer.json`), I can add the...
@ArthurZucker the test is here in the SigLIP PR: https://github.com/huggingface/transformers/pull/29969/files#r1784784751 (copy-pasted below)

```python
# Model does not have a fast tokenizer or PreTrainedTokenizerFast specified in config but can still...
```
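For readers following along, the test is roughly of this shape (a sketch with a placeholder checkpoint, not the exact code from the PR):

```python
from transformers import AutoTokenizer, PreTrainedTokenizerFast

# Placeholder repo id, standing in for a checkpoint that ships a tokenizer.json
# but does not name a fast tokenizer class in its config.
tok = AutoTokenizer.from_pretrained("some-org/model-with-only-tokenizer-json")

# The point of the test: loading should still infer the fast tokenizer.
assert isinstance(tok, PreTrainedTokenizerFast)
```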