Stop-resume training on object detection
Hi there,
Is there any way to resume object detection model training after it has been interrupted? I tried to use Google Colab for model training, but the session always stops midway.
Bump, I have the same question!
As far as I know you can't exactly stop and resume your training, but you can continue training from your already trained model. For example, in this code:
# point train_from_pretrained_model at your most recent checkpoint
trainer.setTrainConfig(object_names_array=["obj_class1"], batch_size=4,
                       num_experiments=150,
                       train_from_pretrained_model=YOUR_PRETRAINED_MODEL_PATH)
You just need to change YOUR_PRETRAINED_MODEL_PATH to the path of your most recent model. This way you can also continue training the same model when you acquire a new dataset, etc.
And if you lose your model files in Google Colab after 12 hours, which is normal behaviour, you should consider saving your models to Google Drive instead.
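Here is a minimal sketch of that setup, assuming ImageAI's DetectionModelTrainer; the Drive folder and checkpoint filename below are placeholders, so adjust the paths to your own project:

from google.colab import drive
from imageai.Detection.Custom import DetectionModelTrainer

# Mount Google Drive so checkpoints survive Colab session resets
drive.mount('/content/drive')

# Hypothetical paths -- replace with your own dataset folder and checkpoint
DATA_DIR = '/content/drive/MyDrive/yolo_training/dataset'
PRETRAINED = '/content/drive/MyDrive/yolo_training/detection_model-last.h5'

trainer = DetectionModelTrainer()
trainer.setModelTypeAsYOLOv3()
trainer.setDataDirectory(data_directory=DATA_DIR)
# Continue from the most recent checkpoint instead of starting from scratch
trainer.setTrainConfig(object_names_array=["obj_class1"], batch_size=4,
                       num_experiments=150,
                       train_from_pretrained_model=PRETRAINED)
trainer.trainModel()

With the data directory on Drive, new checkpoints are written there as well, so a Colab reset no longer loses them.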
Hi, I am trying to continue training using a pretrained model, but I am getting the following error:
ValueError: Layer #249 (named "conv_81"), weight <tf.Variable 'conv_81/kernel:0' shape=(1, 1, 1024, 33) dtype=float32> has shape (1, 1, 1024, 33), but the saved weight has shape (27, 1024, 1, 1).
My code looks like this:
trainer = DetectionModelTrainer()
trainer.setModelTypeAsYOLOv3()
trainer.setDataDirectory(data_directory='dataset')
trainer.setTrainConfig(object_names_array=["class1", "class2", "class3", "class4", "class5", "class6"],
                       batch_size=4, num_experiments=10, train_from_pretrained_model="model.h5")
trainer.trainModel()
The file model.h5 is the best version from the first training session. I have included two new classes this time; could this be the problem?
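For what it's worth, the shapes do look consistent with a class-count change: in standard YOLOv3 each detection head outputs 3 * (num_classes + 5) filters, so 27 filters would correspond to 4 classes and 33 to 6. A quick sanity check of that arithmetic (the formula is general YOLOv3, not something specific to ImageAI):

# YOLOv3 head filters = anchors_per_scale * (num_classes + 4 box coords + 1 objectness)
def yolo_filters(num_classes, anchors_per_scale=3):
    return anchors_per_scale * (num_classes + 5)

print(yolo_filters(4))  # 27 -> matches the saved weight shape (27, 1024, 1, 1)
print(yolo_filters(6))  # 33 -> matches the new layer shape (1, 1, 1024, 33)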
Hey, thanks for this comment! What about the number of experiments? Should we reduce it by the experiment number of the most recent model? Thanks!