TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10
How to resume training from the latest checkpoint?
I guess there should be some parameter to edit. Which one?
If you run the train script, it automatically picks up the last checkpoint and resumes training from there. Does that answer your question?
It doesn't resume training. It starts with Step 0 again. Can anyone resolve it?
I solved it by changing fine_tune_checkpoint in the config file in /training from "C:/tensorflow1/models/research/object_detection/faster_rcnn_inception_v2_coco_2018_01_28/model.ckpt" to the path of my last checkpoint. Something like: fine_tune_checkpoint: "C:/tensorflow1/models/research/object_detection/training/model_45700.ckpt"
With the new API, the fine_tune_checkpoint format above won't work. It has to look like this:
fine_tune_checkpoint: "C:/tensorflow1/models/research/object_detection/training/model.ckpt-45700"
If you run the train script, it automatically picks up the last checkpoint and resumes training from there. Does that answer your question?
This works for me with tensorflow-gpu v1.12.0.
It works in my environment: tensorflow-gpu v1.12.0.
You can set the fine_tune_checkpoint value in your config file to the last saved checkpoint, model.ckpt-XXXXX (where XXXXX is the training step at which the checkpoint was saved):
fine_tune_checkpoint: "voc/train_dir/model.ckpt-XXXXX"
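To avoid typing the step number by hand, the checkpoint prefix can be looked up from the training directory. The following is a minimal sketch using only the Python standard library; it assumes checkpoints are saved as model.ckpt-XXXXX.index (plus .meta and .data files), as in the paths quoted in this thread. The function name latest_checkpoint_prefix is hypothetical and not part of the Object Detection API. TensorFlow 1.x also provides tf.train.latest_checkpoint(train_dir), which reads the "checkpoint" bookkeeping file instead of scanning file names.

```python
import re
from pathlib import Path
from typing import Optional

# Matches index files such as "model.ckpt-45700.index" and captures the step.
_CKPT_INDEX = re.compile(r"^model\.ckpt-(\d+)\.index$")


def latest_checkpoint_prefix(train_dir: str) -> Optional[str]:
    """Return the prefix (e.g. ".../model.ckpt-45700") of the highest-step
    checkpoint found in train_dir, or None if no checkpoint is present.

    Hypothetical helper: it only inspects file names and does not load
    or validate the checkpoint itself.
    """
    best_step = -1
    best_prefix = None
    for path in Path(train_dir).iterdir():
        match = _CKPT_INDEX.match(path.name)
        if match and int(match.group(1)) > best_step:
            best_step = int(match.group(1))
            # Forward slashes, matching the config paths used in this thread.
            best_prefix = (path.parent / f"model.ckpt-{best_step}").as_posix()
    return best_prefix
```

The returned string is what goes between the quotes of fine_tune_checkpoint, for example latest_checkpoint_prefix("C:/tensorflow1/models/research/object_detection/training").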
Hi, can someone please confirm how to resume training from the last checkpoint? I made the necessary changes in the config file, but without success:
fine_tune_checkpoint: "C:/tensorflow1/models/research/object_detection/training/model.ckpt-70000"
After this change, my training resumes from the last checkpoint and then stops right after step 70001:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from training/model.ckpt-70000
INFO:tensorflow:Restoring parameters from training/model.ckpt-70000
WARNING:tensorflow:From C:\Users\Yousaf\anaconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\training\saver.py:1070: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
INFO:tensorflow:Recording summary at step 70000.
INFO:tensorflow:Recording summary at step 70000.
INFO:tensorflow:global step 70001: loss = 0.2056 (31.675 sec/step)
INFO:tensorflow:global step 70001: loss = 0.2056 (31.675 sec/step)
INFO:tensorflow:Stopping Training.
INFO:tensorflow:Stopping Training.
INFO:tensorflow:Finished training! Saving model to disk.
INFO:tensorflow:Finished training! Saving model to disk.
WARNING:tensorflow:From C:\Users\Yousaf\anaconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\training\saver.py:966: remove_checkpoint (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version. Instructions for updating: Use standard file APIs to delete files with this prefix.
WARNING:tensorflow:From C:\Users\Yousaf\anaconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\training\saver.py:966: remove_checkpoint (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version. Instructions for updating: Use standard file APIs to delete files with this prefix.
C:\Users\Yousaf\anaconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\summary\writer\writer.py:386: UserWarning: Attempting to use a closed FileWriter. The operation will be a noop unless the FileWriter is explicitly reopened.
warnings.warn("Attempting to use a closed FileWriter. "
Please confirm. Thank you
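One possible explanation, not confirmed in this thread: the config's train_config block has a num_steps field, and training stops once the global step reaches it. If num_steps is 70000 and the checkpoint is already at step 70000, the run would restore, take one step, and finish, which is what the log above shows. The sketch below compares the two values; the function names are hypothetical, and it parses the config as plain text with a regular expression rather than through the API's own protobuf parser.

```python
import re
from typing import Optional


def checkpoint_step(prefix: str) -> int:
    """Extract the step number from a prefix such as 'training/model.ckpt-70000'."""
    match = re.search(r"model\.ckpt-(\d+)$", prefix)
    if match is None:
        raise ValueError(f"no step number in checkpoint prefix: {prefix!r}")
    return int(match.group(1))


def configured_num_steps(config_text: str) -> Optional[int]:
    """Return the first uncommented 'num_steps: N' value, or None if absent."""
    match = re.search(r"^\s*num_steps\s*:\s*(\d+)", config_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def steps_left(config_text: str, prefix: str) -> Optional[int]:
    """Steps remaining before num_steps is reached; None if num_steps is unset."""
    limit = configured_num_steps(config_text)
    if limit is None:
        return None
    return max(0, limit - checkpoint_step(prefix))
```

If steps_left returns 0 for your config and checkpoint, raising num_steps in the config above the checkpoint's step would be the first thing to try.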
@yousaf-safdar Did you find a solution? My training doesn't even resume. It starts from 0 again even though I have pointed the config at the new checkpoint.