python train.py --dataset ./data/train.tfrecord --val_dataset ./data/_val.tfrecord --classes ./data/coco2.names --num_classes 1 --mode fit --transfer darknet --batch_size 16 --epochs 3 --weights ./checkpoints/yolov3.tf --weights_num_classes 1
I am not able to retrain the model on a custom dataset. Running the command above produces the following error.
2020-05-26 00:14:43.973851: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N
2020-05-26 00:14:43.977913: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4702 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1)
Traceback (most recent call last):
File "train.py", line 193, in <module>
app.run(main)
File "D:\Anaconda3\envs\yolov3-tf2-gpu\lib\site-packages\absl\app.py", line 299, in run
_run_main(main, args)
File "D:\Anaconda3\envs\yolov3-tf2-gpu\lib\site-packages\absl\app.py", line 250, in _run_main
sys.exit(main(argv))
File "train.py", line 97, in main
model_pretrained.load_weights(FLAGS.weights)
File "D:\Anaconda3\envs\yolov3-tf2-gpu\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 234, in load_weights
return super(Model, self).load_weights(filepath, by_name, skip_mismatch)
File "D:\Anaconda3\envs\yolov3-tf2-gpu\lib\site-packages\tensorflow_core\python\keras\engine\network.py", line 1193, in load_weights
status = self._trackable_saver.restore(filepath)
File "D:\Anaconda3\envs\yolov3-tf2-gpu\lib\site-packages\tensorflow_core\python\training\tracking\util.py", line 1283, in restore
checkpoint=checkpoint, proto_id=0).restore(self._graph_view.root)
File "D:\Anaconda3\envs\yolov3-tf2-gpu\lib\site-packages\tensorflow_core\python\training\tracking\base.py", line 209, in restore
restore_ops = trackable._restore_from_checkpoint_position(self) # pylint: disable=protected-access
File "D:\Anaconda3\envs\yolov3-tf2-gpu\lib\site-packages\tensorflow_core\python\training\tracking\base.py", line 908, in _restore_from_checkpoint_position
tensor_saveables, python_saveables))
File "D:\Anaconda3\envs\yolov3-tf2-gpu\lib\site-packages\tensorflow_core\python\training\tracking\util.py", line 289, in restore_saveables
validated_saveables).restore(self.save_path_tensor)
File "D:\Anaconda3\envs\yolov3-tf2-gpu\lib\site-packages\tensorflow_core\python\training\saving\functional_saver.py", line 255, in restore
restore_ops.update(saver.restore(file_prefix))
File "D:\Anaconda3\envs\yolov3-tf2-gpu\lib\site-packages\tensorflow_core\python\training\saving\functional_saver.py", line 102, in restore
restored_tensors, restored_shapes=None)
File "D:\Anaconda3\envs\yolov3-tf2-gpu\lib\site-packages\tensorflow_core\python\training\saving\saveable_object_util.py", line 116, in restore
self.handle_op, self._var_shape, restored_tensor)
File "D:\Anaconda3\envs\yolov3-tf2-gpu\lib\site-packages\tensorflow_core\python\ops\resource_variable_ops.py", line 297, in shape_safe_assign_variable_handle
shape.assert_is_compatible_with(value_tensor.shape)
File "D:\Anaconda3\envs\yolov3-tf2-gpu\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 1110, in assert_is_compatible_with
raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (18,) and (255,) are incompatible
WARNING:tensorflow:Unresolved object in checkpoint: (root).layer-8
W0526 00:14:52.688719 4204 util.py:144] Unresolved object in checkpoint: (root).layer-8
WARNING:tensorflow:Unresolved object in checkpoint: (root).layer-9
W0526 00:14:52.688719 4204 util.py:144] Unresolved object in checkpoint: (root).layer-9
WARNING:tensorflow:Unresolved object in checkpoint: (root).layer-10
W0526 00:14:52.689668 4204 util.py:144] Unresolved object in checkpoint: (root).layer-10
WARNING:tensorflow:Unresolved object in checkpoint: (root).layer-11
W0526 00:14:52.689668 4204 util.py:144] Unresolved object in checkpoint: (root).layer-11
WARNING:tensorflow:A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/guide/checkpoint#loading_mechanics for details.
W0526 00:14:52.689668 4204 util.py:152] A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/guide/checkpoint#loading_mechanics for details.
Same here. Have you managed to solve it since?
I had the same error, and I found that I was providing the wrong number of classes when training.
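To make the "wrong number of classes" point concrete, here is a minimal sketch of how the two numbers in the error (18 and 255) can be read. It assumes the standard YOLOv3 layout of 3 anchors per output scale, where each anchor predicts 4 box coordinates, 1 objectness score, and one score per class. The helper names `output_channels` and `classes_from_channels` are made up for illustration and are not part of train.py.

```python
# Sketch only: assumes the standard YOLOv3 head with 3 anchors per scale.
# Each anchor predicts 4 box coordinates + 1 objectness score + one score
# per class, so an output convolution has
# anchors_per_scale * (num_classes + 5) channels (and a bias of that length).
ANCHORS_PER_SCALE = 3  # assumption: default YOLOv3 setting


def output_channels(num_classes: int, anchors_per_scale: int = ANCHORS_PER_SCALE) -> int:
    """Channel count of a YOLOv3 output convolution for a given class count."""
    return anchors_per_scale * (num_classes + 5)


def classes_from_channels(channels: int, anchors_per_scale: int = ANCHORS_PER_SCALE) -> int:
    """Invert output_channels to recover the class count a bias shape implies."""
    per_anchor, remainder = divmod(channels, anchors_per_scale)
    if remainder or per_anchor < 6:
        raise ValueError(f"{channels} is not a valid YOLOv3 output size")
    return per_anchor - 5


print(output_channels(1))          # model built for 1 class -> shape (18,)
print(classes_from_channels(255))  # class count implied by shape (255,)
```

Read this way, the (18,) shape matches a model built for 1 class and the (255,) shape matches weights saved for 80 classes. If ./checkpoints/yolov3.tf is the stock COCO-converted checkpoint (an assumption, I have not seen the file), then passing `--weights_num_classes 80` while keeping `--num_classes 1` may be worth trying.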
I'm facing this issue too. Does anyone already know the solution?
I don't know exactly why, but my problem was solved after changing --transfer fine_tune to darknet.
This solved the problem, thanks a bunch.