
Feedback about Asynchronous Saving with Distributed Checkpoint (DCP)

Open lebrice opened this issue 3 months ago • 2 comments

Hey there! Little nitpick about the last block of this docs page: https://docs.pytorch.org/tutorials/recipes/distributed_async_checkpoint_recipe.html

The checkpoint_future variable is never written to in the last block. Perhaps the intent was to have this instead?

checkpoint_future = dcp.async_save(state_dict, storage_writer=writer, checkpoint_id=f"{CHECKPOINT_DIR}_step{step}")
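To illustrate why the missing assignment matters, here is a minimal sketch that does not use PyTorch. It assumes the recipe's loop waits on the previous save with a guard like `if checkpoint_future is not None:` before starting the next one; that loop shape is my assumption, not a quote from the docs page. `fake_async_save`, `train`, and `assign_future` are hypothetical names standing in for `dcp.async_save` and the training loop.

```python
from concurrent.futures import Future, ThreadPoolExecutor
import time

# Single worker so pretend "saves" run one at a time in the background.
_executor = ThreadPoolExecutor(max_workers=1)


def fake_async_save(step: int, log: list) -> Future:
    """Hypothetical stand-in for dcp.async_save: returns a Future right away."""
    def _write() -> None:
        time.sleep(0.01)  # pretend to write the checkpoint to storage
        log.append(f"saved step {step}")
    return _executor.submit(_write)


def train(num_steps: int, assign_future: bool) -> int:
    """Run a toy loop and return how often it waited on a previous save."""
    log: list = []
    waits = 0
    checkpoint_future = None
    for step in range(num_steps):
        # ... forward / backward / optimizer step would go here ...

        # Wait for the previous checkpoint before starting a new one.
        if checkpoint_future is not None:
            checkpoint_future.result()
            waits += 1

        future = fake_async_save(step, log)
        if assign_future:
            # The assignment that this issue suggests is missing.
            checkpoint_future = future
    return waits


waits_without = train(3, assign_future=False)
waits_with = train(3, assign_future=True)
print(waits_without, waits_with)  # 0 2
```

Without the assignment the guard never fires, so the loop never waits for an in-flight save; with it, every step after the first waits on the previous one.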

cc @LucasLLC @MeetVadakkanchery @mhorowitz @pradeepfn @ekr0 @haochengsong @Saiteja64

lebrice • Sep 23 '25 13:09