Devin Chotzen-Hartzell
Results: 2 comments of Devin Chotzen-Hartzell
Yes, I've tried it. We ran into a separate issue with saving the optimizer step when dp_partitions >= 2. I'll file another bug for that when I have a chance...
Hi @deepakn94, which kinds of checkpoint resharding are meant to be supported for the torch_dist backend? I'm unable to load a (D, P, T) = (2, 2, 2) checkpoint into...