metaseq

Have trainer.load_checkpoint accept NoneType checkpoint path string

Open KUNAL1612 opened this issue 2 years ago • 0 comments

🚀 Feature Request

In checkpoint_utils.py, at line 312, checkpoint_path_to_load must be a valid string and cannot be None. However, if the string points to an invalid file, extra_state is None. In line with the work done in #158 (which has now been rolled back but presumably will be reinstated at some point), the checkpoint file can sometimes be invalid or incorrect, so we may want to pass None manually as checkpoint_path_to_load.

The exact error when passing a NoneType object is:

TypeError: expected str, bytes or os.PathLike object, not NoneType
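
This message has the same wording as the one Python's os.fspath produces when it is given None. The following is a minimal reproduction, assuming the path eventually reaches an os-level path call; the actual call site inside metaseq is not shown here, and describe_path_error is a hypothetical helper used only for illustration.

```python
import os


def describe_path_error(path):
    """Return the TypeError message os.fspath raises for `path`, or None if it is accepted."""
    try:
        os.fspath(path)  # the standard path-coercion step used by many os/io functions
    except TypeError as exc:
        return str(exc)
    return None


message = describe_path_error(None)
print(message)
```

A regular string such as "checkpoint_last.pt" passes through os.fspath without error, so the failure is specific to None (or other non-path types).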

Pitch

One of two things can be done to address this:

  • ensure that the filename can be None in trainer.load_checkpoint (L405)
  • in checkpoint_utils.load_checkpoint, if checkpoint_path_to_load is None, skip the logic that generates extra_state and set it to None directly, so that training can begin from scratch
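
A rough sketch of the second option, written as a standalone function rather than as a patch to metaseq. The name load_extra_state and the placeholder return value are assumptions made for illustration; the real checkpoint_utils.load_checkpoint has a different signature and does the actual deserialization.

```python
import os
from typing import Any, Dict, Optional


def load_extra_state(checkpoint_path_to_load: Optional[str]) -> Optional[Dict[str, Any]]:
    """Hypothetical stand-in for the extra_state logic in load_checkpoint."""
    # Proposed behaviour: an explicit None means "no checkpoint, start from scratch".
    if checkpoint_path_to_load is None:
        return None

    # Behaviour described in the issue: an invalid path also yields extra_state = None.
    if not os.path.isfile(checkpoint_path_to_load):
        return None

    # Placeholder for the real loading step (assumption: the real code
    # reads the checkpoint file here and returns its extra state).
    return {"checkpoint_path": checkpoint_path_to_load}


extra_state = load_extra_state(None)
print(extra_state)
```

With the early return in place, None never reaches a filesystem call, so the TypeError cannot occur, and a missing file and a missing path are handled the same way.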

Importance

When no checkpoint is passed, checkpoint_path_to_load currently falls back to a default filename. This does not make much sense; in general it would be more sensible for it to be None when the checkpoint does not exist, or exists but is corrupted.

KUNAL1612 • Aug 01 '22 20:08