Get checkpoint based on create time is probably not a good idea

Open hvgazula opened this issue 9 months ago • 6 comments

https://github.com/neuronets/nobrainer/blob/976691d685824fd4bba836498abea4184cffd798/nobrainer/processing/checkpoint.py#L57

What am I trying to do? Initialize from a previous checkpoint, to resume training over more epochs.

For example, the following snippet

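from nobrainer.models import unet
from nobrainer.processing.segmentation import Segmentation

try:
    # Try to resume from an existing checkpoint.
    bem = Segmentation.init_with_checkpoints(
        "unet",
        model_args=dict(batchnorm=True),
        checkpoint_filepath=checkpoint_filepath,
    )
except Exception:
    # No usable checkpoint: start from a fresh model.
    bem = Segmentation(
        unet,
        model_args=dict(batchnorm=True),
        multi_gpu=True,
        checkpoint_filepath=checkpoint_filepath,
    )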

should initialize from a checkpoint if checkpoint_filepath exists. However, the lookup selects the most recently created folder via getctime, so it can pick up other folders created during training (such as predictions) instead of a checkpoint.

Solution:

  • Need a more robust way to look up checkpoints
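
One possible direction, as a rough sketch only: choose the checkpoint by name instead of by creation time, so unrelated folders are never candidates. The helper name find_latest_checkpoint and the "epoch_<N>" folder naming are assumptions made for illustration; they are not taken from nobrainer and would need to match whatever naming the checkpoint callback actually uses.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed (hypothetical) naming scheme: checkpoint folders are "epoch_<N>".
CHECKPOINT_NAME = re.compile(r"^epoch_(\d+)$")


def find_latest_checkpoint(checkpoint_dir: str) -> Optional[Path]:
    """Return the checkpoint folder with the highest epoch number, or None."""
    root = Path(checkpoint_dir)
    if not root.is_dir():
        return None

    candidates = []
    for entry in root.iterdir():
        match = CHECKPOINT_NAME.match(entry.name)
        # Skip anything that is not a checkpoint folder (e.g. predictions).
        if entry.is_dir() and match:
            candidates.append((int(match.group(1)), entry))

    if not candidates:
        return None
    # Compare epoch numbers numerically, so epoch_10 beats epoch_2.
    return max(candidates, key=lambda item: item[0])[1]
```

Selecting on the parsed epoch number avoids relying on filesystem timestamps at all, which also sidesteps cases where folders are copied or restored and their creation times change.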

hvgazula avatar May 10 '24 04:05 hvgazula