yolov3-tf2

How to resume training

Open abhishekbu opened this issue 5 years ago • 4 comments

How do I resume training? What changes should I make to continue training from where it left off?

abhishekbu avatar Jan 26 '20 18:01 abhishekbu

You can start from the checkpoint your previous run generated. If you look at "train.py", you will see that a checkpoint is saved to the /checkpoints folder after every epoch. Pass the most recent one as the weights for your new training run.
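If several epochs have been saved, you need the one with the highest epoch number. Here is a minimal sketch of how that could be picked automatically. It assumes the files follow the 'yolov3_train_<epoch>.tf' naming quoted in this thread and that TensorFlow-format weights sit on disk as a prefix plus '.index' and '.data-*' files. The helper name `latest_checkpoint` is hypothetical and not part of the repository.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed naming pattern, based on the file name quoted in this thread
# ("yolov3_train_10.tf"). Adjust it if your train.py saves under another name.
CHECKPOINT_PATTERN = re.compile(r"^(yolov3_train_(\d+)\.tf)(\.index|\.data-.*)?$")


def latest_checkpoint(folder: str = "checkpoints") -> Optional[str]:
    """Return the weight prefix with the highest epoch number, or None."""
    directory = Path(folder)
    if not directory.is_dir():
        return None

    best_epoch = -1
    best_prefix = None
    for entry in directory.iterdir():
        match = CHECKPOINT_PATTERN.match(entry.name)
        if match is None:
            continue  # skip unrelated files in the folder
        # Compare epochs as integers so that epoch 10 beats epoch 2.
        epoch = int(match.group(2))
        if epoch > best_epoch:
            best_epoch = epoch
            best_prefix = str(directory / match.group(1))
    return best_prefix
```

The returned prefix is what you would pass wherever you previously passed 'yolov3.tf'.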

molimat avatar Mar 03 '20 00:03 molimat

Hmmm, got the same question.

Already tried using the ModelCheckpoint weights, but I receive the following error when doing that:

ValueError: Shapes (255,) and (75,) are incompatible

  1. Trained the network with the original weights (yolov3.tf) for 10 epochs [colab_gpu.ipynb step 5]
  2. At epoch 10, 'yolov3_train_10.tf' is created
  3. Restarted the training, passing 'yolov3_train_10.tf' instead of 'yolov3.tf'
  4. The error above is thrown

Am I missing something?

Tomiha avatar Apr 21 '20 13:04 Tomiha

@Tomiha Did you solve it?

Regards

amh28 avatar Jul 15 '20 03:07 amh28

Hi @amh28,

Yes, I got it working. If I remember correctly, there was an issue with the number of classes I passed through when training from previous weights.
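The two shapes in the error message are consistent with a class-count mismatch. In standard YOLOv3, each output layer has anchors × (classes + 5) channels, with 3 anchors per scale and 5 box fields (x, y, w, h, objectness). Here is a small sketch of that arithmetic, assuming the default 3 anchors per scale. The function name is hypothetical.

```python
ANCHORS_PER_SCALE = 3  # standard YOLOv3 uses 3 anchor boxes per output scale
BOX_FIELDS = 5         # x, y, w, h, objectness


def classes_from_output_channels(channels: int,
                                 anchors: int = ANCHORS_PER_SCALE) -> int:
    """Recover the class count implied by a YOLOv3 output-layer channel count."""
    if channels % anchors != 0:
        raise ValueError(f"{channels} is not divisible by {anchors} anchors")
    per_anchor = channels // anchors
    if per_anchor <= BOX_FIELDS:
        raise ValueError(f"{channels} channels leave no room for classes")
    return per_anchor - BOX_FIELDS


# The two shapes from the error message in this thread:
print(classes_from_output_channels(255))  # 80
print(classes_from_output_channels(75))   # 20
```

So one side was built for 80 classes and the other for 20. The error message alone does not tell you which is the model and which is the saved weights. Either way, the number of classes you pass when resuming has to match the number the checkpoint was trained with.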

Tomiha avatar Jul 16 '20 08:07 Tomiha