drone-net
Drones not detected after training of Tiny YOLOv3
Hi, I've followed your tutorial to train Tiny YOLOv3 on drone images. After ~1900 iterations I'm still not getting any detections, even on training images, which is strange. I've tested with your weights and everything works, so there must be some problem in my training procedure.
Darknet has been compiled with GPU support and the training was run on Google Colab. Here is the output of the training run with the command `./darknet detector train drone.data cfg/yolov3-tiny-drone.cfg darknet53.conv.74`:
layer filters size input output
0 conv 16 3 x 3 / 1 416 x 416 x 3 -> 416 x 416 x 16 0.150 BFLOPs
1 max 2 x 2 / 2 416 x 416 x 16 -> 208 x 208 x 16
2 conv 32 3 x 3 / 1 208 x 208 x 16 -> 208 x 208 x 32 0.399 BFLOPs
3 max 2 x 2 / 2 208 x 208 x 32 -> 104 x 104 x 32
4 conv 64 3 x 3 / 1 104 x 104 x 32 -> 104 x 104 x 64 0.399 BFLOPs
5 max 2 x 2 / 2 104 x 104 x 64 -> 52 x 52 x 64
6 conv 128 3 x 3 / 1 52 x 52 x 64 -> 52 x 52 x 128 0.399 BFLOPs
7 max 2 x 2 / 2 52 x 52 x 128 -> 26 x 26 x 128
8 conv 256 3 x 3 / 1 26 x 26 x 128 -> 26 x 26 x 256 0.399 BFLOPs
9 max 2 x 2 / 2 26 x 26 x 256 -> 13 x 13 x 256
10 conv 512 3 x 3 / 1 13 x 13 x 256 -> 13 x 13 x 512 0.399 BFLOPs
11 max 2 x 2 / 1 13 x 13 x 512 -> 13 x 13 x 512
12 conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BFLOPs
13 conv 256 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 256 0.089 BFLOPs
14 conv 512 3 x 3 / 1 13 x 13 x 256 -> 13 x 13 x 512 0.399 BFLOPs
15 conv 18 1 x 1 / 1 13 x 13 x 512 -> 13 x 13 x 18 0.003 BFLOPs
16 yolo
17 route 13
18 conv 128 1 x 1 / 1 13 x 13 x 256 -> 13 x 13 x 128 0.011 BFLOPs
19 upsample 2x 13 x 13 x 128 -> 26 x 26 x 128
20 route 19 8
21 conv 256 3 x 3 / 1 26 x 26 x 384 -> 26 x 26 x 256 1.196 BFLOPs
22 conv 18 1 x 1 / 1 26 x 26 x 256 -> 26 x 26 x 18 0.006 BFLOPs
23 yolo
Loading weights from darknet53.conv.74...Done!
yolov3-tiny-drone
Learning Rate: 0.001, Momentum: 0.9, Decay: 0.0005
Resizing
416
Loaded: 0.000073 seconds
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499833, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.498678, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499832, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.498684, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499832, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.498682, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499833, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.498673, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499832, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.498681, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499833, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.498680, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499832, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.498676, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499833, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.498681, .5R: -nan, .75R: -nan, count: 0
1: 315.492737, 315.492737 avg, 0.000000 rate, 1.219311 seconds, 24 images
Loaded: 0.000067 seconds
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499833, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.498674, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499833, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.498679, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499832, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.498677, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499833, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.498676, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499833, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.498684, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499832, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.498684, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499833, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.498679, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499832, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.498675, .5R: -nan, .75R: -nan, count: 0
2: 315.492767, 315.492737 avg, 0.000000 rate, 0.481109 seconds, 48 images
................
Loaded: 0.000072 seconds
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.235015, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.036456, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.235015, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.036457, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.235015, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.036457, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.235015, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.036458, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.235015, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.036456, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.235015, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.036454, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.235015, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.036455, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.235015, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.036453, .5R: -nan, .75R: -nan, count: 0
379: 26.248299, 33.163815 avg, 0.000021 rate, 0.740130 seconds, 9096 images
Loaded: 0.000068 seconds
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.232247, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.034958, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.232247, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.034960, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.232247, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.034961, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.232246, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.034962, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.232248, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.034957, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.232247, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.034961, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.232247, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.034960, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.232248, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.034960, .5R: -nan, .75R: -nan, count: 0
380: 25.502501, 32.397682 avg, 0.000021 rate, 0.755673 seconds, 9120 images
Resizing
544
..................
Loaded: 0.000052 seconds
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000022, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000003, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000022, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000003, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000022, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000003, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000022, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000003, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000022, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000003, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000022, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000003, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000022, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000003, .5R: -nan, .75R: -nan, count: 0
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000022, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000003, .5R: -nan, .75R: -nan, count: 0
1930: 0.000000, 0.002069 avg, 0.001000 rate, 0.583884 seconds, 46320 images
Do you notice anything strange that might be related to the problem I'm facing? Thank you so much for your help and for sharing the tutorial.
Edit: Here is a link to the avg loss plot from another training run that shows the same problem: https://imgur.com/8IG3yO4
@damnko I see that you always have `count = 0`. Perhaps check whether all your training images are correctly labeled and configured (e.g. labels in the same directory as the images)?
Also, feel free to start training from `weights/yolo-drone.weights` instead of from `darknet53.conv.74`.
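Before retraining, it may help to verify the dataset programmatically. Here is a minimal sketch of such a check, assuming the Darknet convention that each image listed in the training list has a sibling `.txt` label file with the same base name; the `train.txt` filename is an assumption and should be adjusted to match your `drone.data` configuration:

```python
# Sanity check: every image listed in the training list should have a
# matching, non-empty Darknet label file (same base name, .txt extension).
# The default "train.txt" path is an assumption -- adjust to your drone.data.
import os

def check_labels(train_list="train.txt"):
    missing, empty = [], []
    with open(train_list) as f:
        for img_path in (line.strip() for line in f if line.strip()):
            label_path = os.path.splitext(img_path)[0] + ".txt"
            if not os.path.exists(label_path):
                missing.append(label_path)
            elif os.path.getsize(label_path) == 0:
                empty.append(label_path)
    return missing, empty

if __name__ == "__main__":
    missing, empty = check_labels()
    print(f"{len(missing)} missing label files, {len(empty)} empty label files")
```

If either list is non-empty, Darknet will silently treat those images as having no objects, which would produce exactly the `count: 0` lines seen in the log.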
Hi @chuanenlin, thanks for your prompt reply. Yes, the images and labels are both in the same folder, and I downloaded everything from your repo; I didn't do the labeling myself, so I suppose the labels are correct.
That `count` should indicate the number of times an object is detected in a specific region of that image, right? What value should I expect to see there when I check the full log?
I will try to start training from your weights, but I suppose it should also work starting from `darknet53.conv.74`, correct?
@damnko Yes. Therefore, `count = 0` indicates either some kind of error (e.g. in the labeling) or that more training iterations are needed. With correct configuration and a decent number of training iterations, you should expect to see many lines with non-NaN values for IOU, Class, etc. and count > 0.
If you start from `darknet53.conv.74`, you would be starting from "scratch" (note that I trained for a couple of days on an NVIDIA Tesla GPU), while starting from `weights/yolo-drone.weights` would mean starting from where I left off.
Oh, a couple of days... But you were training the standard, non-tiny version, right? And then what does `1930: 0.000000` indicate? Shouldn't that be the loss you also refer to in the blog post?
@damnko That's correct. FYI, feel free to open an issue here and check whether anyone else has faced a similar problem.
Ok, thanks. I will investigate a bit more and then open an issue on the darknet repo. I will keep you posted and come back to close this issue.
Hi @chuanenlin, sorry to bother you again. I followed another tutorial with a different dataset and it worked; now I'm trying the same settings with your images/labels and I'm having the same problem.
The labels from the other (working) tutorial have the following form:
0 0.469603 0.48314199999999996 0.797766 0.795552
While your labels have this format:
0 167.5 116.5 175.0 157.0
So the first format appears to use relative (normalized) positions and dimensions, while yours uses absolute pixel values. Is something wrong, or should both work? I wonder if the problem is there. Thanks again for your help.
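For reference, converting from the absolute format above to the normalized one could be sketched as follows. This assumes the absolute lines follow `class cx cy w h` in pixels (matching standard YOLO ordering) and that the image dimensions are known; the 350x233 size used in the example is purely illustrative, not the actual size of the drone images:

```python
# Hypothetical converter: absolute-pixel YOLO label line -> normalized line.
# Assumes "class cx cy w h" ordering in pixels; image size must be supplied.
def to_relative(line: str, img_w: int, img_h: int) -> str:
    cls, cx, cy, w, h = line.split()
    cx, cy, w, h = float(cx), float(cy), float(w), float(h)
    # Normalize center and size by the image dimensions.
    return f"{cls} {cx / img_w:.6f} {cy / img_h:.6f} {w / img_w:.6f} {h / img_h:.6f}"

print(to_relative("0 167.5 116.5 175.0 157.0", 350, 233))
# -> "0 0.478571 0.500000 0.500000 0.673820"
```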
@damnko May I ask which YOLO repository you are using? Some variants, such as AlexeyAB's version, require different labeling formats, like the one you mentioned. The labels in this repo should work with the original (pjreddie's) version.
This is the version I compiled: https://github.com/pjreddie/darknet
That's odd - I trained the weights with the labels in this repo, so the formatting should be fine. 🤔
I don't know; even when using LabelImg with the YOLO format, the labels look like this:
0 0.534304 0.380769 0.640333 0.584615
Feel free to close the issue, since for now I think my problem is solved by using this labeling format, even though I haven't tried converting your labels into it.
Thanks for your feedback, and thanks again for sharing your work 🙏
@damnko What OS are you on? I'm having a similar problem with the normal (non-tiny) YOLO. Also, what was the other tutorial?
Hi @trigaten, I was using Linux Mint. It was some time ago and I can't remember the exact tutorial, but I think it was this one: https://www.learnopencv.com/training-yolov3-deep-learning-based-custom-object-detector/