Will Constable

116 comments by Will Constable

> Just a random thought. Do we need torch.pipelining.Schedule to do step 2 option 2? The reason for this is that there are 2 common mappings: Loop and V. Loop: (e.g....

> During init_weights, if a rank does not participate in the initialization of the current module (due to the module being deleted by PP), it simply does nothing. After each module...

oops forgot about this RFC. opened another one on pytorch side for RNG-state management for torch.pipelining. perhaps reassuringly, i rediscovered the same 2 options as in this proposal 😅 ....

closing this RFC now after landing various enhancements to DTensor RNG and updating TorchTitan's RNG configuration (#689). 1. Seed checkpoint is not deleted, but it is strictly optional now, should...

Thanks for posting this RFC! I want to see if we can make changes to existing torch.distributed apis first to solve some/all of your problems. And then if needed, we...

@youkaichao would you be happy to use the workflow @d4l3k proposed? or is there still something missing? @d4l3k is the PrefixStore needed such that each store can use a default...

@youkaichao is the dst/src mapping issue the only reason that send/recv do not work? I started to prototype a possible fix for that today, I'll share it here...