Sahaj Agarwal
@Charles-ux-bit Assuming the 30B-IML checkpoint has the same architecture as the 30B checkpoint, a hack that works is to copy "shard_metadata" from the latter into the former and then run the reshard_fsdp script.
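
A rough sketch of the copy step, in case it helps. It assumes each checkpoint shard loads as a plain dict with a top-level "shard_metadata" key; the helper name `copy_shard_metadata` is made up for illustration and is not part of any library.

```python
import copy


def copy_shard_metadata(source_ckpt: dict, target_ckpt: dict,
                        key: str = "shard_metadata") -> dict:
    """Return a copy of target_ckpt carrying source_ckpt's shard metadata.

    Assumption: both checkpoints are dicts and the metadata lives under
    a top-level key (here "shard_metadata"). Only valid if the two
    models share the same architecture and sharding layout.
    """
    if key not in source_ckpt:
        raise KeyError(f"source checkpoint has no {key!r} entry")
    patched = dict(target_ckpt)  # shallow copy; weights are not duplicated
    patched[key] = copy.deepcopy(source_ckpt[key])
    return patched
```

With PyTorch, one would presumably load each pair of matching shard files with `torch.load(path, map_location="cpu")`, pass them through this helper, write the result back with `torch.save`, and then run reshard_fsdp on the patched files. The file names and shard pairing depend on how the checkpoints were exported, so check those before overwriting anything.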