Andrew Gu
I would really appreciate some pointers to the complicated initialization to learn more about it. And yes, I think that the seed checkpoint can be used to avoid the meta...
cc: @wanchaol @tianyu-l The above two pointers are good examples of real-model init methods that do not fit our current meta-device init flow. As far as I can tell, both...
@qiziAI Thanks for the PR! Could you provide some more details of the conflict for our understanding?
There was some past discussion on this (https://github.com/pytorch/torchtitan/pull/280).
Failures are all inductor-related, not FSDP2-related.
@pytorchbot merge -i
@pytorchbot rebase -s
@pytorchbot merge
@pytorchbot rebase -s
@pytorchbot merge