Ankit Kumar
Results: 1 issue by Ankit Kumar
I have a codebase forked from torchtitan with minor changes. FSDP trains very well with minimal instability, but HSDP on the same codebase exhibits loss spikes. Is there some reason...
Labels: question, module: fsdp