Add ability to resume training from latest checkpoint without specifying path

Open ashleve opened this issue 5 years ago • 2 comments

Add a method that recursively walks everything in logs/ and finds the most recently saved checkpoint (by save date). Add a config flag for resuming training from that checkpoint:

resume_latest: True

Useful when we want to quickly resume our latest run without specifying ckpt path.
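
A minimal sketch of the lookup described above, using only the standard library. The function name `find_latest_checkpoint`, the `*.ckpt` pattern, and the use of file modification time as "date saved" are assumptions made for illustration, not part of the template:

```python
from pathlib import Path
from typing import Optional


def find_latest_checkpoint(log_dir: str = "logs", pattern: str = "*.ckpt") -> Optional[Path]:
    """Return the most recently modified checkpoint under log_dir, or None.

    Hypothetical helper: "latest" is taken to mean the largest file
    modification time, which is an assumption about what "date saved" means.
    """
    root = Path(log_dir)
    if not root.is_dir():
        return None
    # rglob descends into every subdirectory; keep regular files only.
    checkpoints = [p for p in root.rglob(pattern) if p.is_file()]
    if not checkpoints:
        return None
    return max(checkpoints, key=lambda p: p.stat().st_mtime)
```

When `resume_latest` is set, the returned path could be substituted for the checkpoint path the run would otherwise require. A `None` result (no checkpoint found) would need to be handled explicitly, for example by starting from scratch.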

Should be added as an enhancement to utils.extras().

Could also automatically override the whole config with the correct one from .hydra folder.
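
One way the matching config could be located, as a sketch: Hydra writes the composed config to `.hydra/config.yaml` inside the run's output directory by default, so walking up from the checkpoint path until that file appears should find it. The function name `find_hydra_config` and the `stop_at` parameter are hypothetical, and the sketch assumes the checkpoint lives somewhere below the run directory:

```python
from pathlib import Path
from typing import Optional


def find_hydra_config(ckpt_path: str, stop_at: str = "logs") -> Optional[Path]:
    """Return the .hydra/config.yaml belonging to the run that owns ckpt_path.

    Hypothetical helper: searches each ancestor directory of the checkpoint,
    and gives up once the stop_at directory has been checked.
    """
    ckpt = Path(ckpt_path).resolve()
    stop = Path(stop_at).resolve()
    for parent in ckpt.parents:
        candidate = parent / ".hydra" / "config.yaml"
        if candidate.is_file():
            return candidate
        if parent == stop:
            # Do not search above the logs root.
            break
    return None
```

The file found this way could then be loaded (for example with `OmegaConf.load`) and used in place of the freshly composed config. How that override is wired into the run is left open here.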

ashleve, Mar 29 '21 14:03

@ashleve This is cool but what if your artifacts are stored in wandb?

turian, Sep 12 '22 14:09

@turian Checkpoints are always available in the output dirs. Uploading them as artifacts through the wandb logger doesn't change that.

If you really need it despite that, perhaps you could write a function that retrieves the latest wandb checkpoint from the current project through their API.

Supporting individual logger use cases is out of the scope of this template though, so I'm not planning on introducing anything like that.

ashleve, Sep 12 '22 18:09