Yolov3_tiny-Hardhat-detection_Tensorflow

Training problem

Open MC1016 opened this issue 5 years ago • 27 comments

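Hello, sorry to bother you. Regarding model training: I can never get training to run on my own dataset. I have some questions I would like to ask you. When would be a convenient time for you?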

MC1016 avatar Feb 21 '20 03:02 MC1016

Hey, write to me in English. My Chinese is weak.

rashidch avatar Feb 22 '20 08:02 rashidch

OK. When I run train.py to train on my own data, it says: OutOfRangeError (see above for traceback): End of sequence. I can't work out how to handle this error. Have you ever faced this problem?

MC1016 avatar Feb 22 '20 13:02 MC1016

I did not face the problem before. Can you show me your train.txt and test.txt?

rashidch avatar Feb 22 '20 14:02 rashidch

Format your train.txt and test.txt files like this:

image path and name, classID, xl, yl, xr, yr, classID, xl, yl, xr, yr ...

rashidch avatar Feb 22 '20 14:02 rashidch

My train.txt looks like this:
C:\Users\Dell\Desktop\test\YOLOv3_tiny_TensorFlow-master/VOC2007/JPEGImages/part2_000094.jpg 94,2,176,116,0
C:\Users\Dell\Desktop\test\YOLOv3_tiny_TensorFlow-master/VOC2007/JPEGImages/000937.jpg 188,61,258,150,0 390,72,461,157,0
C:\Users\Dell\Desktop\test\YOLOv3_tiny_TensorFlow-master/VOC2007/JPEGImages/part2_001002.jpg 376,22,795,543,0
And the test.txt:
C:\Users\Dell\Desktop\test\YOLOv3_tiny_TensorFlow-master/VOC2007/JPEGImages/000002.jpg 37,32,76,84,0 165,103,208,158,0 178,71,213,113,0 221,44,251,88,0 249,61,283,112,0 335,60,376,112,0 344,107,385,163,0 372,59,402,110,0 409,77,454,136,0 9,75,46,124,1
C:\Users\Dell\Desktop\test\YOLOv3_tiny_TensorFlow-master/VOC2007/JPEGImages/000005.jpg 378,109,458,179,0
C:\Users\Dell\Desktop\test\YOLOv3_tiny_TensorFlow-master/VOC2007/JPEGImages/000019.jpg 306,134,353,189,0 421,124,470,184,0 641,45,711,119,0 550,80,565,96,0 540,65,550,80,0 565,72,579,91,0 335,73,348,91,1 404,72,419,93,1 381,69,393,86,1 284,70,296,87,1 293,72,304,88,1 307,75,317,89,1 318,75,330,91,1 261,74,274,92,1 245,74,256,89,1 232,67,243,83,1 225,71,236,87,1 215,71,226,87,1 207,68,215,78,1 614,83,625,99,0 596,76,607,89,0 607,76,618,86,0 551,66,561,80,0 561,71,570,83,0 544,77,553,92,1 494,69,503,81,0 521,69,529,83,0 368,72,382,91,1 393,71,404,86,0 257,68,268,84,1 29,53,47,74,1 6,67,29,91,0
C:\Users\Dell\Desktop\test\YOLOv3_tiny_TensorFlow-master/VOC2007/JPEGImages/000022.jpg

MC1016 avatar Feb 22 '20 14:02 MC1016

How did you generate your train.txt and test.txt? Can you show me your complete error trace?

rashidch avatar Feb 22 '20 17:02 rashidch

I can solve your issue if you show me the complete error trace. I think the problem may be in the data parsing.

rashidch avatar Feb 22 '20 17:02 rashidch

OK, do you have a QQ number or WeChat? I generated my train.txt and test.txt with the voc_annotation.py file. The error trace is:
  File "C:/Users/Dell/Desktop/test/YOLOv3_tiny_TensorFlow-master/train.py", line 179, in <module>
    image, y_true_13, y_true_26, y_true_52 = dataset_iterator.get_next()
I really need your help. Thank you.

MC1016 avatar Feb 22 '20 17:02 MC1016

OutOfRangeError (see above for traceback): End of sequence
[[node IteratorGetNext (defined at C:/Users/Dell/Desktop/test/YOLOv3_tiny_TensorFlow-master/train.py:179) = IteratorGetNextoutput_shapes=[, , , ], output_types=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
[[{{node IteratorGetNext/_17}} = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_124_IteratorGetNext", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]]

MC1016 avatar Feb 22 '20 17:02 MC1016

Traceback (most recent call last):
  File "C:/Users/win10/Desktop/pycharm/YOLOv3_tiny_TensorFlow-master/train.py", line 150, in <module>
    train_dataset = train_dataset.apply(tf.data.experimental.map_and_batch(
AttributeError: module 'tensorflow.data' has no attribute 'experimental'

MC1016 avatar Feb 23 '20 03:02 MC1016

Which TensorFlow version did you install? The data format in your train.txt does not match mine.

Your train.txt:
C:\Users\Dell\Desktop\test\YOLOv3_tiny_TensorFlow-master/VOC2007/JPEGImages/000002.jpg 37,32,76,84,0 165,103,208,158,0 178,71,213,113,0 221,44,251,88,0 249,61,283,112,0 335,60,376,112,0 344,107,385,163,0 372,59,402,110,0 409,77,454,136,0 9,75,46,124,1

My train.txt:
577 /home/rashid/YOLOv3_TensorFlow-master/data/my_data/GDUT-HWD/JPEGImages/01457.jpg 440 293 1 235 84 258 110 1 291 93 307 115 1 320 96 335 114

rashidch avatar Feb 23 '20 08:02 rashidch
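
The two formats compared above differ only in layout, so one can be rewritten into the other. Below is a minimal sketch, not code from this repository. It assumes the maintainer's line layout is `index path width height class x_min y_min x_max y_max ...`, which is inferred from the single example line above, and that image paths contain no spaces. The function name `convert_line` and the 640x480 size are hypothetical; the real width and height must be read from each image (for example with Pillow's `Image.open(path).size`).

```python
def convert_line(index, line, width, height):
    """Rewrite 'path x1,y1,x2,y2,cls ...' as 'index path width height cls x1 y1 x2 y2 ...'.

    Assumes the target layout inferred from the example line in this thread
    and that the image path contains no spaces.
    """
    path, *boxes = line.split()
    if not boxes:
        raise ValueError(f"no boxes on line: {line!r}")
    fields = [str(index), path, str(width), str(height)]
    for box in boxes:
        # Source order is x_min,y_min,x_max,y_max,class; the class moves to the front.
        x_min, y_min, x_max, y_max, class_id = box.split(",")
        fields += [class_id, x_min, y_min, x_max, y_max]
    return " ".join(fields)


# Hypothetical image size of 640x480, used only for illustration.
print(convert_line(0, "img/000937.jpg 188,61,258,150,0 390,72,461,157,0", 640, 480))
# prints "0 img/000937.jpg 640 480 0 188 61 258 150 0 390 72 461 157"
```

A line with no boxes raises `ValueError` rather than being written out, since the training code discussed later in this thread rejects images without at least one target.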

I do not have WeChat. I have a LINE number :)

rashidch avatar Feb 23 '20 08:02 rashidch

OK, my TensorFlow version is tensorflow-gpu 1.11.0. Which version should I use? How did you get your train.txt file?

MC1016 avatar Feb 23 '20 08:02 MC1016

The TensorFlow version is OK, but I was not using a GPU, so you will have to check whether the code has been updated for GPU use. For train.txt, I changed the code that generates this file, but unfortunately I could not upload that code. I will share that file with you today, but you will have to wait.

rashidch avatar Feb 23 '20 08:02 rashidch

Thank you so much. I am waiting for your file.

MC1016 avatar Feb 23 '20 09:02 MC1016

https://github.com/rashidch/Yolov3_tiny-Hardhat-detection_Tensorflow/blob/master/parse_voc_xml.py You need this file to parse voc data.

rashidch avatar Feb 23 '20 13:02 rashidch

image data folder should look like this one.

rashidch avatar Feb 23 '20 13:02 rashidch

I hope this helps you.

rashidch avatar Feb 23 '20 13:02 rashidch

Thank you so much, it's really helpful. But another error occurred: AssertionError: Annotation error! Please check your annotation file. Make sure there is at least one target object in each image.

MC1016 avatar Feb 24 '20 04:02 MC1016

I think my dataset should be OK. I can use it normally in other frameworks.

MC1016 avatar Feb 24 '20 04:02 MC1016

I guess the annotation file is not recognized? I found the error in data_utils.py, but I don't know how to fix it. I'm going crazy...

MC1016 avatar Feb 24 '20 05:02 MC1016

I think it means that your annotation file has too few target objects (classes, maybe). The code is generalized, but you need to check to make sure.

rashidch avatar Feb 24 '20 06:02 rashidch

There are only two classes in my dataset. I found this in the data_utils.py file: assert len(s) > 8, 'Annotation error! Please check your annotation file. Make sure there is at least one target object in each image.' I don't know what it means.

MC1016 avatar Feb 24 '20 06:02 MC1016

Try to check what s is and why its length should be greater than 8.

rashidch avatar Feb 24 '20 07:02 rashidch
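
One way to follow that suggestion is to scan the annotation file before training. The sketch below is not taken from data_utils.py. It assumes that `s` is the annotation line split on whitespace, and that a valid line has 4 leading fields (index, path, width, height) followed by 5 fields per box, as in the maintainer's example line earlier in this thread. Under those assumptions a line with one box has 9 fields, which is why the length must exceed 8. The function name `check_annotation_lines` is hypothetical.

```python
def check_annotation_lines(lines):
    """Return (line_number, reason) pairs for lines likely to fail the length check.

    Assumes each valid line is 'index path width height' plus 5 fields per box;
    this layout is inferred from the thread, not confirmed against data_utils.py.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        fields = line.strip().split()
        if not fields:
            problems.append((number, "blank line"))
        elif len(fields) <= 8:
            problems.append((number, "fewer than 9 fields, so no complete box"))
        elif (len(fields) - 4) % 5 != 0:
            problems.append((number, "box fields are not a multiple of 5"))
    return problems


sample = [
    "577 /data/01457.jpg 440 293 1 235 84 258 110 1 291 93 307 115",
    "C:/data/000937.jpg 188,61,258,150,0 390,72,461,157,0",
]
for number, reason in check_annotation_lines(sample):
    print(f"line {number}: {reason}")
# prints "line 2: fewer than 9 fields, so no complete box"
```

The second sample line uses the comma-separated layout posted earlier in this thread. It splits into only 3 fields, so under the assumptions above it would trip the assertion even though the image has two boxes.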

I ran train.py. It works for some steps, but then the same error occurs again. I still can't fully understand the whole program. Forgive me for being a beginner.

MC1016 avatar Feb 24 '20 12:02 MC1016

Keep the learning rate low (0.0001) for the first few steps of training. I will try to figure out better initial hyperparameters. I hope that solves your problem.

rashidch avatar Feb 24 '20 13:02 rashidch

I tried a lower learning rate and batch size, but it still does not work.

MC1016 avatar Feb 24 '20 15:02 MC1016