Cyril Vallez
Hey! Super super sorry, I missed the ping! Indeed, models sharing the last part of their names can introduce issues. I think I have an idea that should fix it,...
Hey @sbucaille @konstantinos-p! It will be solved by https://github.com/huggingface/transformers/pull/37829 🤗 I will merge asap! Sorry for the wait on this! EDIT: Just merged the PR!
Hey @sbucaille! Just opened https://github.com/huggingface/transformers/pull/42844 to fix the renaming issue. Let me know if that works!
cc @molbap @zucchini-nlp, feel free to tag me when you believe this is ready for final review!
Yes, see my answer here https://github.com/huggingface/transformers/pull/36895#issuecomment-2815598690!
Sorry about that! Could you rebase on (or merge) the latest main and try again? Tied weights were fixed earlier today!
If weights don't need to be tied, we also need to remove all associated comments/methods!
Hey! The issue is not about the `tp_device` and CUDA/other accelerators, it's the fact that we are setting the `data` to the buffer itself! So the best would be to...
Hmm, I tested on both MPS (Mac) and AMD GPU hardware, and it works in both cases... Both with `a = a.data.to()` and `a.data = a.data.to()`. Do you have more...
> All tests passed other than `tests/utils/test_modeling_utils.py::ModelUtilsTest::test_generation_config_is_loaded_with_model`, unrelated to adding this model. > > Please review this PR again? And could you tell me how to fix the error? to:...