Chien-Chin Huang
@yzhangcs https://github.com/pytorch/pytorch/pull/148825 fixes the issue.
@yzhangcs The PR has landed. You should be able to get it with the next PyTorch nightly build. Please let me know if that completely resolves the issue. I can...
@yzhangcs I'll close the issue. Let me know if you still encounter issues.
@galalalala The configuration you suggested is not valid. With your proposed sharding strategy, a global batch (assuming a batch size of 8 and a sequence length of 8192) needs to first be sharded...
@galalalala Yes.
Thanks for the PR. I feel we can remove `.github/scripts/update_version.sh` and just use `importlib.metadata` to get the version. But that is orthogonal to this PR.
If one installs the TorchTitan package correctly, the version should be in the metadata. And since this PR makes the version dynamic, it will automatically reflect the latest version with git...
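As a sketch of the `importlib.metadata` approach mentioned above (the distribution name `torchtitan` is an assumption here; the actual name comes from the package metadata):

```python
from importlib.metadata import PackageNotFoundError, version

# "torchtitan" is an assumed distribution name for illustration;
# look up the installed package's metadata rather than a hardcoded
# version string or a version-bump script.
try:
    pkg_version = version("torchtitan")
except PackageNotFoundError:
    # Package not installed (e.g. running from a source checkout).
    pkg_version = "unknown"

print(pkg_version)
```

With a dynamic version (e.g. derived from git via the build backend), this lookup reflects whatever version was recorded at install time, with no separate update script to keep in sync.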
Since CI is currently not available, please refrain from landing the PR until we have some CI signals.
@d4l3k It seems that `write_state_dict` and `read_state_dict` won't work with DTensor. Please correct me if I'm wrong.