feat: support indivisible shards for TP model loading and TPlizing.
What does this PR do?
Fixes https://github.com/huggingface/transformers/issues/37051
The approach is to support uneven sharding by seeking segments of data that mimic `torch.chunk`, since `torch.chunk` is the sharding style adopted by the torch `Shard` placement API for both even and uneven sharding. Finally, we pass the stride and shape to `from_local` to allow for uneven sharding.
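The `torch.chunk` split rule that the uneven sharding follows can be sketched in plain Python (a minimal illustration of the chunking semantics, not the PR's actual implementation; `chunk_bounds` is a hypothetical helper name):

```python
import math

def chunk_bounds(dim_size: int, num_shards: int):
    """Mimic torch.chunk along one dimension: every shard gets
    ceil(dim_size / num_shards) elements except the last, which takes
    whatever remains (and may be smaller, or even empty)."""
    step = math.ceil(dim_size / num_shards)
    bounds = []
    for rank in range(num_shards):
        start = min(rank * step, dim_size)
        end = min(start + step, dim_size)
        bounds.append((start, end))
    return bounds

# A dimension of 10 split across 3 ranks yields shards of 4, 4, and 2 rows,
# which is why the segments to load from the checkpoint are uneven.
print(chunk_bounds(10, 3))  # [(0, 4), (4, 8), (8, 10)]
```

Because the last shard can be smaller, the loader cannot assume equal-sized segments, which is why the stride and shape must be passed explicitly when reconstructing the DTensor.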
```python
from transformers import AutoModelForCausalLM
import torch

m2 = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0", tp_plan=None)
m = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0", tp_plan="auto")

print(m.model.layers[0].self_attn.q_proj.weight)
print(m.model.layers[0].self_attn.q_proj.weight.shape)

# Gather the sharded weight; it should match the unsharded baseline.
ft = m.model.layers[0].self_attn.q_proj.weight.full_tensor().to("cpu")
assert torch.equal(ft, m2.model.layers[0].self_attn.q_proj.weight.to("cpu"))
# assert should pass
```
Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you read the contributor guideline, Pull Request section?
- [x] Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case. https://github.com/huggingface/transformers/issues/37051
- [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- [ ] Did you write any new necessary tests?
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.
@ArthurZucker @SunMarc @muellerzr