Sylvain Gugger
cc @muellerzr
You cannot send your model to `accelerator.prepare` if using `device_map="auto"` (as the model will be split across GPUs already).
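Something like this, as a minimal sketch (the checkpoint name and toy dataloader are placeholders, not from your snippet): the model loaded with `device_map="auto"` stays out of `accelerator.prepare`, only the optimizer and dataloader go through it.

```python
# Minimal sketch, assuming a causal LM checkpoint and a toy dataloader.
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator
from transformers import AutoModelForCausalLM

accelerator = Accelerator()

# `device_map="auto"` already dispatches the model across the available GPUs...
model = AutoModelForCausalLM.from_pretrained("gpt2", device_map="auto")

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
dataset = TensorDataset(torch.randint(0, 50257, (8, 16)))
dataloader = DataLoader(dataset, batch_size=2)

# ...so only the optimizer and dataloader are passed to `accelerator.prepare`,
# never the model itself.
optimizer, dataloader = accelerator.prepare(optimizer, dataloader)
```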
The snippet of code above does not match what you are telling me. Could you please share something I can reproduce?
cc @muellerzr
I don't see why not, if you want to make a PR adding support :-) Initially I didn't include it because users can always do the init on their own...
The `decapoda-research/llama-7b-hf` checkpoints cannot work with Hugging Face anyway, as they are no longer maintained by their owner.
I'm not sure I understand why you think this is a bug. Your layer needs all weights to be on the same device for the cat operation, so should be...
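Roughly what I mean (a toy layer made up for illustration, not your code): move the tensors onto one device before calling `torch.cat`.

```python
# Illustrative sketch only: a toy layer whose sub-modules may live on
# different devices once the model has been dispatched with device_map.
import torch
import torch.nn as nn

class ConcatLayer(nn.Module):
    def __init__(self):
        super().__init__()
        self.a = nn.Linear(4, 4)
        self.b = nn.Linear(4, 4)

    def forward(self, x):
        out_a = self.a(x.to(self.a.weight.device))
        out_b = self.b(x.to(self.b.weight.device))
        # `torch.cat` requires all its inputs on the same device, so bring
        # the second output onto the first one's device before concatenating.
        return torch.cat([out_a, out_b.to(out_a.device)], dim=-1)
```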
cc @muellerzr
cc @muellerzr