Ankit Kumar

Results: 1 issue by Ankit Kumar

I have a codebase forked from torchtitan with minor changes. FSDP trains very well with minimal instability, but HSDP on the same codebase exhibits loss spikes. Is there some reason...

question
module: fsdp