accelerate
Feature Request: Device mapping for models that aren't sharded
If possible, this feature would be nice for loading models that are too large for CPU or GPU RAM alone: with device mapping, the model would be split across both, and spill over to the hard drive if needed.
Yes, that's exactly what Accelerate does for both sharded and non-sharded models. I'm not sure what feature you feel is missing; could you share some code?
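For reference, here is a minimal sketch of how this device mapping can be requested when loading a checkpoint through Transformers; the checkpoint name and offload folder below are only placeholders:

```python
# Sketch: ask Accelerate (via Transformers) to split a model across
# GPU, CPU RAM, and disk. Model name and offload folder are examples.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-neo-2.7B"  # example checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",          # let Accelerate place layers on GPU/CPU/disk
    offload_folder="offload",   # weights that fit nowhere else are offloaded here
    torch_dtype=torch.float16,  # halve memory use where the hardware supports it
)

# hf_device_map shows which device each module ended up on
print(model.hf_device_map)
```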
I apologize for the delay, but this should capture all of my issues. I'm not sure whether it's a problem with the model I'm using, but GPT-Neo barely touches GPU memory, while sharded OPT with the same parameter count works just fine. It may not be the best example, but take a look: https://colab.research.google.com/gist/JD-The-65th/3df3077443d48b2015b18c8ca9e0cc70/accelerate_opt.ipynb#scrollTo=Q0Zf_d5RhVpO
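In case it helps with debugging, one way to check why GPT-Neo barely uses the GPU is to pass explicit per-device memory budgets and then inspect the resulting device map; the checkpoint name and memory sizes below are illustrative assumptions, not taken from the notebook:

```python
# Sketch: cap per-device memory so the automatic device map is forced to fill
# the GPU before spilling to CPU RAM or disk. Budgets here are examples only.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-neo-2.7B",                # example non-sharded checkpoint
    device_map="auto",
    max_memory={0: "12GiB", "cpu": "24GiB"},  # budgets for GPU 0 and CPU RAM
    offload_folder="offload",
)
print(model.hf_device_map)  # compare this placement against the sharded OPT run
```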
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.