BackupAndRestore callback does not work as in the example

Open N4321D opened this issue 1 year ago • 1 comments

I noticed that the BackupAndRestore callback does not restore the last trained epoch number, so it does not reduce the number of epochs trained after an interruption. The example suggests that it should (link).

System info: OS: Pop OS 22.04 LTS; Keras version: 3.0.1; Keras backend: tested with torch and tensorflow; Python version: 3.10.13 (in Anaconda)

import keras
import numpy as np


class InterruptingCallback(keras.callbacks.Callback):
    # Simulates a crash at the start of (zero-indexed) epoch 4.
    def on_epoch_begin(self, epoch, logs=None):
        if epoch == 4:
            raise RuntimeError('Interrupting!')


callback = keras.callbacks.BackupAndRestore(backup_dir="/tmp/backup")
model = keras.models.Sequential([keras.layers.Dense(10)])
model.compile(keras.optimizers.SGD(), loss='mse')

# First run: interrupted by InterruptingCallback at epoch 4.
try:
    model.fit(np.arange(100).reshape(5, 20), np.zeros(5), epochs=10,
              batch_size=1, callbacks=[callback, InterruptingCallback()],
              verbose=0)
except Exception as e:
    print(e)

# Second run: expected to resume from the backed-up epoch.
history = model.fit(np.arange(100).reshape(5, 20), np.zeros(5),
                    epochs=10, batch_size=1, callbacks=[callback],
                    verbose=0)
print(len(history.history['loss']))

Output:

Interrupting!
10  

The last model.fit call runs all 10 epochs, even though the example suggests that it would remember the interruption at epoch 4 and continue from there.
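For readers unfamiliar with the expected behavior, here is a minimal, library-free sketch of the resume semantics described above. It is not Keras's implementation: the `fit` function, the `state.json` file name, and the `completed_epochs` key are all hypothetical and exist only for illustration. The idea is that each completed epoch is recorded in the backup directory, and a later call starts from that count instead of from zero.

```python
import json
import os
import tempfile


def fit(backup_dir, epochs, interrupt_at=None):
    """Toy training loop (hypothetical) that resumes from a backed-up epoch count.

    Returns the list of epoch indices actually run in this call.
    """
    os.makedirs(backup_dir, exist_ok=True)
    state_path = os.path.join(backup_dir, "state.json")  # hypothetical file name

    # Restore: start after the last completed epoch, if a backup exists.
    initial_epoch = 0
    if os.path.exists(state_path):
        with open(state_path) as f:
            initial_epoch = json.load(f)["completed_epochs"]

    ran = []
    for epoch in range(initial_epoch, epochs):
        if epoch == interrupt_at:
            raise RuntimeError("Interrupting!")
        ran.append(epoch)  # stand-in for one epoch of real training
        # Back up: record how many epochs have completed so far.
        with open(state_path, "w") as f:
            json.dump({"completed_epochs": epoch + 1}, f)

    # Training finished normally, so the backup is no longer needed.
    if os.path.exists(state_path):
        os.remove(state_path)
    return ran


backup_dir = tempfile.mkdtemp()
try:
    fit(backup_dir, epochs=10, interrupt_at=4)
except RuntimeError as e:
    print(e)

resumed = fit(backup_dir, epochs=10)
print(len(resumed))
```

Under these assumptions the second call runs only epochs 4 through 9, so it reports 6 epochs rather than 10. That is the behavior the reproduction script above was expected to show.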

N4321D • Feb 14 '24 21:02

Thanks for reporting the issue.

I was able to replicate the issue here.

sachinprasadhs • Feb 16 '24 00:02

Thanks for the report -- I just fixed it. Please check that the build at HEAD works for your use case.

fchollet • Feb 22 '24 21:02

Thank you, it works!

N4321D • Feb 22 '24 22:02
