ShiXinCheng
When loading FSDP checkpoints with world_size=1, the current implementation fails because the state dict reconstruction logic assumes sharded parameters. In the single‑process case, _flat_param handling is inconsistent, leading to mismatched...
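The report is cut off above, so the exact mismatch is not shown. As a rough illustration of why world_size=1 is a degenerate case for shard reconstruction, here is a toy flatten, pad, shard, and rebuild round trip in pure Python. It is only a sketch: it does not use PyTorch, it is not the FSDP implementation, and the helper names (`flatten`, `shard`, `unshard`) are hypothetical. With a single rank, the one "shard" is the whole flat buffer with no padding, and reconstruction code has to handle that the same way as any other shard count.

```python
# Toy model of flatten -> pad -> shard -> reconstruct.
# Pure Python; NOT PyTorch's API. All names here are hypothetical.

def flatten(params):
    """Concatenate named parameter lists into one flat list plus (name, length) metadata."""
    flat, meta = [], []
    for name, values in params.items():
        flat.extend(values)
        meta.append((name, len(values)))
    return flat, meta


def shard(flat, world_size):
    """Right-pad the flat list with zeros so it splits evenly, then cut one chunk per rank."""
    chunk = -(-len(flat) // world_size)  # ceiling division
    padded = flat + [0.0] * (chunk * world_size - len(flat))
    return [padded[r * chunk:(r + 1) * chunk] for r in range(world_size)]


def unshard(shards, meta):
    """Rebuild the named parameters from the shards, dropping any trailing padding."""
    total = sum(n for _, n in meta)
    flat = [v for s in shards for v in s][:total]
    out, offset = {}, 0
    for name, n in meta:
        out[name] = flat[offset:offset + n]
        offset += n
    return out


params = {"weight": [1.0, 2.0, 3.0], "bias": [4.0, 5.0]}
flat, meta = flatten(params)

# The round trip should hold for one rank as well as for several.
for world_size in (1, 2, 4):
    assert unshard(shard(flat, world_size), meta) == params
```

In this sketch the single-rank case needs no special branch because `unshard` never assumes more than one shard or the presence of padding. Whether the reported failure comes from such an assumption cannot be determined from the truncated text.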