Arthur
For 1., yes, a warning should indeed be issued; sorry, we raise an error for mismatched sizes!
Indeed opening again!
Do you want to open a PR to propagate the changes we made to Llama and gemma?
#30642 will fix this! 🤗
@sanchit-gandhi I think you tested it for `gemma`, which is based on the `Llama` one, and we had correct performance, no?
Hey @dfdx, we are not actively working on this, so we are opening it to the community in case some of the community magicians figure it out! 🤗
Thanks for your efforts! merging 🥳
Hey @antoinethl. Sorry for the delay. When you tried with the older version of transformers, are you sure that the `decoder_input_ids` were not just 2 tokens? This could...
cc @NielsRogge and @younesbelkada, if one of you wants to review once @jpizarrom makes the CIs go green!