
Yolo training does not stop, and what type of images can be used

Open mwindowshz opened this issue 6 years ago • 19 comments

Hi, I am new to working with YOLO and neural networks.

I used your example data to run YOLO without a GPU. It has been running for about 4 days and is not stopping.

Command: darknet_no_gpu.exe detector train data/obj.data yolo-obj.cfg darknet19_448.conv.23

The output (progress of learning) started like this:

Loading weights from darknet19_448.conv.23...Done!
Learning Rate: 0.0001, Momentum: 0.9, Decay: 0.0005
Loaded: 1.588000 seconds
Region Avg IOU: 0.120223, Class: 0.362698, Obj: 0.612478, No Obj: 0.513480, Avg Recall: 0.000000, count: 9
Region Avg IOU: 0.238924, Class: 0.439833, Obj: 0.548626, No Obj: 0.514007, Avg Recall: 0.058824, count: 34
Region Avg IOU: 0.200323, Class: 0.519404, Obj: 0.561993, No Obj: 0.512736, Avg Recall: 0.166667, count: 12
Region Avg IOU: 0.196986, Class: 0.321407, Obj: 0.634472, No Obj: 0.513765, Avg Recall: 0.000000, count: 18
Region Avg IOU: 0.187150, Class: 0.526126, Obj: 0.575120, No Obj: 0.513986, Avg Recall: 0.083333, count: 24
Region Avg IOU: 0.147942, Class: 0.573413, Obj: 0.527868, No Obj: 0.513225, Avg Recall: 0.000000, count: 11
Region Avg IOU: 0.147899, Class: 0.419107, Obj: 0.490241, No Obj: 0.512636, Avg Recall: 0.000000, count: 18
Region Avg IOU: 0.155855, Class: 0.431433, Obj: 0.527465, No Obj: 0.512580, Avg Recall: 0.000000, count: 11
1: 23.719202, 23.719202 avg, 0.000100 rate, 1525.245972 seconds, 64 images

Now, after 4 days, the output is:

Region Avg IOU: 0.776839, Class: 0.999406, Obj: 0.881055, No Obj: 0.007709, Avg Recall: 1.000000, count: 17
Region Avg IOU: 0.789834, Class: 0.998349, Obj: 0.804908, No Obj: 0.010932, Avg Recall: 1.000000, count: 32
Region Avg IOU: 0.716955, Class: 0.998312, Obj: 0.747980, No Obj: 0.008201, Avg Recall: 0.863636, count: 22
Region Avg IOU: 0.791601, Class: 0.999399, Obj: 0.838217, No Obj: 0.008875, Avg Recall: 1.000000, count: 21
Region Avg IOU: 0.774943, Class: 0.997678, Obj: 0.875351, No Obj: 0.006102, Avg Recall: 1.000000, count: 9
Region Avg IOU: 0.764226, Class: 0.998302, Obj: 0.723759, No Obj: 0.011918, Avg Recall: 0.969697, count: 33
Region Avg IOU: 0.760802, Class: 0.998138, Obj: 0.750877, No Obj: 0.014727, Avg Recall: 0.976744, count: 43
Region Avg IOU: 0.755867, Class: 0.999586, Obj: 0.833746, No Obj: 0.005354, Avg Recall: 1.000000, count: 11
207: 0.218326, 0.224815 avg, 0.001000 rate, 1554.923950 seconds, 13248 images
Loaded: 0.000000 seconds
Region Avg IOU: 0.811563, Class: 0.997593, Obj: 0.791255, No Obj: 0.012824, Avg Recall: 1.000000, count: 31
Region Avg IOU: 0.763254, Class: 0.998242, Obj: 0.847682, No Obj: 0.008106, Avg Recall: 1.000000, count: 19
Region Avg IOU: 0.785790, Class: 0.997373, Obj: 0.738193, No Obj: 0.013735, Avg Recall: 1.000000, count: 37

And it is not stopping. Again, I used your aeroplanes and birds data as-is.

Is this normal? Do I have a problem?

I also wanted to ask if you can explain the output fields (Region Avg IOU:, Class:, Obj:, No Obj:, Avg Recall:, count:) and the line that comes after every few iterations: 207: 0.218326, 0.224815 avg, 0.001000 rate, 1554.923950 seconds, 13248 images

I also wanted to ask: can I use images that contain several classes, in your case an image with both a bird and an aeroplane? (That is, marking different rectangles with different class tags in the same image.)

Thanks in advance m.

mwindowshz avatar Jul 24 '17 07:07 mwindowshz

@mwindowshz Hi,

The output is explained here: https://timebutt.github.io/static/understanding-yolov2-training-output/
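For quick reference, the per-iteration summary line can be read roughly like this (using the numbers from the log above; one iteration = one batch of 64 images here):

```
207: 0.218326, 0.224815 avg, 0.001000 rate, 1554.923950 seconds, 13248 images
 207              - iteration (batch) number
 0.218326         - total loss for this iteration
 0.224815 avg     - smoothed average loss; this is the main value to watch
 0.001000 rate    - current learning rate
 1554.923950 s    - time spent on this iteration
 13248 images     - images processed so far (iteration * batch = 207 * 64)
```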

> I also wanted to ask: can I use images that contain several classes, in your case an image with both a bird and an aeroplane? (That is, marking different rectangles with different class tags in the same image.)

Yes, you can use images with many different objects marked on the same image. You can also use images without any objects at all; I use them to avoid false-positive detections.
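For illustration, a label .txt file for one image containing objects of two classes could look like this (format: object-class x_center y_center width height, all coordinates relative to the image size; the values below are invented):

```
0 0.412500 0.537037 0.175000 0.296296
1 0.781250 0.250000 0.093750 0.148148
```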

> Is this normal? Do I have a problem?

There is no problem. But for good detection you should use ~500 - 2000 images per class, and you should train about 2000 iterations per class. So on CPU (without a GPU), training 4000 iterations (for 2 classes) will take a very long time :)
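As a rough estimate using the timings from the log above (~1,525-1,555 seconds per iteration on CPU), 4000 iterations would take about 4000 × 1540 s ≈ 70 days.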

AlexeyAB avatar Jul 24 '17 11:07 AlexeyAB

Hi, thank you for the information, it was very helpful.

> There is no problem. But for good detection you should use ~500 - 2000 images per class, and you should train about 2000 iterations per class. So on CPU (without a GPU), training 4000 iterations (for 2 classes) will take a very long time :)

Is there a way to set the number of iterations per class? From running the code I found that the images are passed in random order, for better training, but is there some parameter counting iterations for each class?

Also, I have a dataset of small images, around 187x187. Do I have to set the cfg file to reflect this (setting the width and height in the cfg file to something other than 416)? I know the images are resized, but stretching an image can make it blurry. Also, if I change the size in the cfg, can I still use the pre-trained weights darknet19_448.conv.23?

On the other hand, if I use very large images (Full HD) with small objects, then resizing to 416x416 can make the objects much smaller, so they would not be detected. Are these considerations right?

I am running training on the images (slow, on CPU for now), and sometimes I get:

Region Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.001821, Avg Recall: -nan(ind), count: 0

I guess this means the network could not find anything in the image?

Another question, about testing: is there a built-in way to pass a set of images and get the results listed, so I can have statistics on how many objects were found in each image and classified correctly? It is always possible to write something, but if it is already present...

Again, I appreciate your help. Thanks a lot!

mwindowshz avatar Jul 27 '17 07:07 mwindowshz

Hi,

> Is there a way to set the number of iterations per class? From running the code I found that the images are passed in random order, for better training, but is there some parameter counting iterations for each class?

No, but you can set the total number of iterations: https://github.com/AlexeyAB/darknet/blob/576dbe12e64697f10aaa15f6ef23e0a3d5eea5b5/cfg/yolo-voc.2.0.cfg#L15 So if you set classes=5 in the cfg-file, then set max_batches = 10000.
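For example, a hypothetical 5-class setup would touch at least these cfg lines (the filters value in the last convolutional layer follows the YOLOv2 rule num * (classes + coords + 1), which matches the filters=30 and filters=35 values in the cfg-files quoted later in this thread):

```
[net]
max_batches = 10000   # 2000 iterations per class * 5 classes

[convolutional]       # last convolutional layer, right before [region]
filters=50            # 5 anchors * (5 classes + 4 coords + 1)

[region]
classes=5
```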


> Also, I have a dataset of small images, around 187x187. Do I have to set the cfg file to reflect this (setting the width and height in the cfg file to something other than 416)? I know the images are resized, but stretching an image can make it blurry. Also, if I change the size in the cfg, can I still use the pre-trained weights darknet19_448.conv.23?

You can use darknet19_448.conv.23 in this case. If most of the images in your training dataset are around 187x187, then you should set width=192 height=192 during training.


> On the other hand, if I use very large images (Full HD) with small objects, then resizing to 416x416 can make the objects much smaller, so they would not be detected. Are these considerations right?

You can train the network on 192x192, but after training you can change width=608 height=608 (or 800x800) in the cfg-file and use this cfg-file for detection. But this is only required if you really need to find small objects.
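As a sketch (values assumed; width and height should stay divisible by 32, since the network downsamples by 32):

```
# [net] section used during training
width=192
height=192

# [net] section used later for detection with the same trained weights
width=608
height=608
```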


> I am running training on the images (slow, on CPU for now), and sometimes I get:
>
> Region Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.001821, Avg Recall: -nan(ind), count: 0
>
> I guess this means the network could not find anything in the image?

If you get nan only sometimes, that's fine. But if you always get nan, then your dataset is wrongly labeled.


> Another question, about testing: is there a built-in way to pass a set of images and get the results listed, so I can have statistics on how many objects were found in each image and classified correctly? It is always possible to write something, but if it is already present...
>
> Again, I appreciate your help. Thanks a lot!

So you need True Positives. Yolo shows this as Recall rather than the real recall: https://github.com/AlexeyAB/darknet/issues/35#issuecomment-284221184

You can run: darknet.exe detector recall data/obj.data yolo-obj.cfg backup\yolo-obj_9000.weights as described here: https://github.com/AlexeyAB/darknet#when-should-i-stop-training

It can give you information such as: 7586 7612 7689 RPs/Img: 68.23 IOU: 77.86% Recall:99.00% where Recall is what you need. (But I would also advise you to look at the IoU value: the bigger, the better; it is a more general metric.)

In the code:

fprintf(stderr, "%5d %5d %5d\tRPs/Img: %.2f\tIOU: %.2f%%\tRecall:%.2f%%\n", i, correct, total, (float)proposals/(i+1), avg_iou*100/total, 100.*correct/total);

https://github.com/AlexeyAB/darknet/blob/576dbe12e64697f10aaa15f6ef23e0a3d5eea5b5/src/detector.c#L435
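Reading that fprintf against the sample line above, field by field (an interpretation of the linked code, not an official description):

```
7586            i                  - index of the last image evaluated
7612            correct            - cumulative true positives (detections matching a ground-truth box above the IoU threshold)
7689            total              - cumulative number of ground-truth boxes
RPs/Img: 68.23  proposals/(i+1)    - average region proposals per image
IOU: 77.86%     avg_iou*100/total  - mean IoU over ground-truth boxes
Recall: 99.00%  100*correct/total  - fraction of ground-truth boxes that were detected
```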

AlexeyAB avatar Jul 27 '17 12:07 AlexeyAB

Hello! You say:

> You can also use images without any objects at all; I use them to avoid false-positive detections.

but each image must correspond to a text file with the same name containing "object-class x y width height". For a jpg-file with no objects to detect, what do you need to write in the txt file?

ivanevgenyev avatar Nov 17 '17 13:11 ivanevgenyev

@ivanevgenyev You should have an empty .txt-file for these images.
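A minimal sketch of how to create those empty label files (the directory layout and extensions here are assumptions, not part of the repo):

```python
# Creates an empty .txt label file next to every image that does not have one,
# so images without objects can be used as negative samples.
import os

image_dir = "data/obj"  # hypothetical path to the training images
for name in os.listdir(image_dir):
    if name.lower().endswith((".jpg", ".jpeg", ".png")):
        label_path = os.path.join(image_dir, os.path.splitext(name)[0] + ".txt")
        if not os.path.exists(label_path):
            open(label_path, "w").close()  # empty file = no objects in this image
```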

AlexeyAB avatar Nov 20 '17 06:11 AlexeyAB

Hi, I have a problem: Avg Recall: 0.000000 is always zero. I started training around 2 hours ago and it is still zero, and Region Avg IOU is very low, for example 0.027899. My images are 3000x3000 and I have around 1000 objects in each image, just one class. I followed this tutorial: https://timebutt.github.io/static/how-to-train-yolov2-to-detect-custom-objects/ Any hint? I put height=2976 width=2976 in the cfg since I have many small objects, about 44x44 each, in every image.

any hint?

My cfg is:

[net]
# Testing
batch=64 subdivisions=8
# Training
batch=64
subdivisions=8
height=2976 width=2976 channels=3 momentum=0.9 decay=0.0005 angle=0 saturation = 1.5 exposure = 1.5 hue=.1
learning_rate=0.001 burn_in=1000 max_batches = 80200 policy=steps steps=40000,60000 scales=.1,.1

[convolutional] batch_normalize=1 filters=32 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=64 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=64 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=1024 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=3 stride=1 pad=1 activation=leaky

#######

[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=1024 activation=leaky

[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=1024 activation=leaky

[route] layers=-9

[convolutional] batch_normalize=1 size=1 stride=1 pad=1 filters=64 activation=leaky

[reorg] stride=2

[route] layers=-1,-4

[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=1024 activation=leaky

[convolutional] size=1 stride=1 pad=1 filters=30 activation=linear

[region] anchors = 1.3221, 1.73145, 3.19275, 4.00944, 5.05587, 8.09892, 9.47112, 4.84053, 11.2364, 10.0071 bias_match=1 classes=1 coords=4 num=5 softmax=1 jitter=.3 rescore=1

object_scale=5 noobject_scale=1 class_scale=1 coord_scale=1

absolute=1 thresh = .6 random=1

SteveIb avatar Dec 17 '17 12:12 SteveIb

  • Check your training dataset using this tool: https://github.com/AlexeyAB/Yolo_mark
  • You should use yolo-voc.2.0.cfg instead of yolo-voc.cfg if you use this repo: https://github.com/AlexeyAB/darknet
  • Do you use this https://github.com/AlexeyAB/darknet repo or original repo?
  • For your network resolution of 2976x2976 there shouldn't be any problem detecting 1000 objects in one image, because the final grid is 2976/32 = 93, so you can detect up to (2976/32)*(2976/32)*5 = 93*93*5 = 43245 objects.

AlexeyAB avatar Dec 17 '17 12:12 AlexeyAB

  • Try to base your cfg-file on yolo-voc.2.0.cfg instead of yolo-voc.cfg

  • Try to change 4 lines in the source code - set 1000 instead of 30 in these 4 lines, then recompile, and train again:

  1. https://github.com/AlexeyAB/darknet/blob/75c39f57507ed81f33571271b939de400a69adf0/src/region_layer.c#L30
  2. https://github.com/AlexeyAB/darknet/blob/75c39f57507ed81f33571271b939de400a69adf0/src/region_layer.c#L190
  3. https://github.com/AlexeyAB/darknet/blob/75c39f57507ed81f33571271b939de400a69adf0/src/region_layer.c#L222
  4. https://github.com/AlexeyAB/darknet/blob/75c39f57507ed81f33571271b939de400a69adf0/src/region_layer.c#L259

AlexeyAB avatar Dec 17 '17 12:12 AlexeyAB

After changing the 4 lines in the source code and recompiling, I got segmentation fault 11: ./darknet detector train cfg/obj.data cfg/yolo-voc.2.0.cfg darknet19_448.conv.23 yolo-voc

layer filters size input output
0 conv 32 3 x 3 / 1 2976 x2976 x 3 -> 2976 x2976 x 32
1 max 2 x 2 / 2 2976 x2976 x 32 -> 1488 x1488 x 32
2 conv 64 3 x 3 / 1 1488 x1488 x 32 -> 1488 x1488 x 64
3 max 2 x 2 / 2 1488 x1488 x 64 -> 744 x 744 x 64
4 conv 128 3 x 3 / 1 744 x 744 x 64 -> 744 x 744 x 128
5 conv 64 1 x 1 / 1 744 x 744 x 128 -> 744 x 744 x 64
6 conv 128 3 x 3 / 1 744 x 744 x 64 -> 744 x 744 x 128
7 max 2 x 2 / 2 744 x 744 x 128 -> 372 x 372 x 128
8 conv 256 3 x 3 / 1 372 x 372 x 128 -> 372 x 372 x 256
9 conv 128 1 x 1 / 1 372 x 372 x 256 -> 372 x 372 x 128
10 conv 256 3 x 3 / 1 372 x 372 x 128 -> 372 x 372 x 256
11 max 2 x 2 / 2 372 x 372 x 256 -> 186 x 186 x 256
12 conv 512 3 x 3 / 1 186 x 186 x 256 -> 186 x 186 x 512
13 conv 256 1 x 1 / 1 186 x 186 x 512 -> 186 x 186 x 256
14 conv 512 3 x 3 / 1 186 x 186 x 256 -> 186 x 186 x 512
15 conv 256 1 x 1 / 1 186 x 186 x 512 -> 186 x 186 x 256
16 conv 512 3 x 3 / 1 186 x 186 x 256 -> 186 x 186 x 512
17 max 2 x 2 / 2 186 x 186 x 512 -> 93 x 93 x 512
18 conv 1024 3 x 3 / 1 93 x 93 x 512 -> 93 x 93 x1024
19 conv 512 1 x 1 / 1 93 x 93 x1024 -> 93 x 93 x 512
20 conv 1024 3 x 3 / 1 93 x 93 x 512 -> 93 x 93 x1024
21 conv 512 1 x 1 / 1 93 x 93 x1024 -> 93 x 93 x 512
22 conv 1024 3 x 3 / 1 93 x 93 x 512 -> 93 x 93 x1024
23 conv 1024 3 x 3 / 1 93 x 93 x1024 -> 93 x 93 x1024
24 conv 1024 3 x 3 / 1 93 x 93 x1024 -> 93 x 93 x1024
25 route 16
26 reorg / 2 186 x 186 x 512 -> 93 x 93 x2048
27 route 26 24
28 conv 1024 3 x 3 / 1 93 x 93 x3072 -> 93 x 93 x1024
29 conv 30 1 x 1 / 1 93 x 93 x1024 -> 93 x 93 x 30
30 detection
mask_scale: Using default '1.000000'
Loading weights from darknet19_448.conv.23...Done!
Learning Rate: 0.0001, Momentum: 0.9, Decay: 0.0005
Loaded: 23.269316 seconds
Segmentation fault: 11

SteveIb avatar Dec 17 '17 13:12 SteveIb

yolo-voc.2.0.cfg

[net] batch=64 subdivisions=8 height=2976 width=2976 channels=3 momentum=0.9 decay=0.0005 angle=0 saturation = 1.5 exposure = 1.5 hue=.1

learning_rate=0.0001 max_batches = 45000 policy=steps steps=100,25000,35000 scales=10,.1,.1

[convolutional] batch_normalize=1 filters=32 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=64 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=64 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=1024 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=3 stride=1 pad=1 activation=leaky

#######

[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=1024 activation=leaky

[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=1024 activation=leaky

[route] layers=-9

[reorg] stride=2

[route] layers=-1,-3

[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=1024 activation=leaky

[convolutional] size=1 stride=1 pad=1 filters=30 activation=linear

[region] anchors = 1.08,1.19, 3.42,4.41, 6.63,11.38, 9.42,5.11, 16.62,10.52 bias_match=1 classes=1 coords=4 num=5 softmax=1 jitter=.2 rescore=1

object_scale=5 noobject_scale=1 class_scale=1 coord_scale=1

absolute=1 thresh = .6 random=0

SteveIb avatar Dec 17 '17 13:12 SteveIb

On my older yolo-obj.cfg, the results:

./darknet detector train cfg/obj.data cfg/yolo-obj.cfg darknet19_448.conv.23 yolo-obj

layer filters size input output
0 conv 32 3 x 3 / 1 2976 x2976 x 3 -> 2976 x2976 x 32
1 max 2 x 2 / 2 2976 x2976 x 32 -> 1488 x1488 x 32
2 conv 64 3 x 3 / 1 1488 x1488 x 32 -> 1488 x1488 x 64
3 max 2 x 2 / 2 1488 x1488 x 64 -> 744 x 744 x 64
4 conv 128 3 x 3 / 1 744 x 744 x 64 -> 744 x 744 x 128
5 conv 64 1 x 1 / 1 744 x 744 x 128 -> 744 x 744 x 64
6 conv 128 3 x 3 / 1 744 x 744 x 64 -> 744 x 744 x 128
7 max 2 x 2 / 2 744 x 744 x 128 -> 372 x 372 x 128
8 conv 256 3 x 3 / 1 372 x 372 x 128 -> 372 x 372 x 256
9 conv 128 1 x 1 / 1 372 x 372 x 256 -> 372 x 372 x 128
10 conv 256 3 x 3 / 1 372 x 372 x 128 -> 372 x 372 x 256
11 max 2 x 2 / 2 372 x 372 x 256 -> 186 x 186 x 256
12 conv 512 3 x 3 / 1 186 x 186 x 256 -> 186 x 186 x 512
13 conv 256 1 x 1 / 1 186 x 186 x 512 -> 186 x 186 x 256
14 conv 512 3 x 3 / 1 186 x 186 x 256 -> 186 x 186 x 512
15 conv 256 1 x 1 / 1 186 x 186 x 512 -> 186 x 186 x 256
16 conv 512 3 x 3 / 1 186 x 186 x 256 -> 186 x 186 x 512
17 max 2 x 2 / 2 186 x 186 x 512 -> 93 x 93 x 512
18 conv 1024 3 x 3 / 1 93 x 93 x 512 -> 93 x 93 x1024
19 conv 512 1 x 1 / 1 93 x 93 x1024 -> 93 x 93 x 512
20 conv 1024 3 x 3 / 1 93 x 93 x 512 -> 93 x 93 x1024
21 conv 512 1 x 1 / 1 93 x 93 x1024 -> 93 x 93 x 512
22 conv 1024 3 x 3 / 1 93 x 93 x 512 -> 93 x 93 x1024
23 conv 1024 3 x 3 / 1 93 x 93 x1024 -> 93 x 93 x1024
24 conv 1024 3 x 3 / 1 93 x 93 x1024 -> 93 x 93 x1024
25 route 16
26 conv 64 1 x 1 / 1 186 x 186 x 512 -> 186 x 186 x 64
27 reorg / 2 186 x 186 x 64 -> 93 x 93 x 256
28 route 27 24
29 conv 1024 3 x 3 / 1 93 x 93 x1280 -> 93 x 93 x1024
30 conv 30 1 x 1 / 1 93 x 93 x1024 -> 93 x 93 x 30
31 detection
mask_scale: Using default '1.000000'
Loading weights from darknet19_448.conv.23...Done!
Learning Rate: 0.001, Momentum: 0.9, Decay: 0.0005
Resizing 448
Loaded: 4.002774 seconds
Region Avg IOU: nan, Class: nan, Obj: nan, No Obj: 0.492963, Avg Recall: nan, count: 0
Region Avg IOU: 0.008950, Class: 1.000000, Obj: 0.693369, No Obj: 0.494811, Avg Recall: 0.000000, count: 123
Region Avg IOU: 0.042687, Class: 1.000000, Obj: 0.667997, No Obj: 0.492357, Avg Recall: 0.000000, count: 14
Region Avg IOU: nan, Class: nan, Obj: nan, No Obj: 0.494897, Avg Recall: nan, count: 0
Region Avg IOU: 0.017823, Class: 1.000000, Obj: 0.639639, No Obj: 0.492845, Avg Recall: 0.000000, count: 125

SteveIb avatar Dec 17 '17 13:12 SteveIb

No, I have a C# program which finds the objects for me, and I converted to Yolo format using this tutorial: https://timebutt.github.io/static/how-to-train-yolov2-to-detect-custom-objects/ (there is code there for the conversion; it works well). I fed it to the network and the results are as above.

SteveIb avatar Dec 17 '17 13:12 SteveIb

Hi,

I am using yolo-obj.cfg, which is as follows.

I have trained my dataset with 70 images.

[net]
# Testing
#batch=1 #subdivisions=1
# Training

batch=64 subdivisions=8 height=416 width=416 channels=3 momentum=0.9 decay=0.0005 angle=0 saturation = 1.5 exposure = 1.5 hue=.1

learning_rate=0.001 burn_in=1000 max_batches = 80200 policy=steps steps=5,10 scales=.1,.1 i_snapshot_iteration=2

[convolutional] batch_normalize=1 filters=32 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=64 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=64 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=1024 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=3 stride=1 pad=1 activation=leaky

#######

[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=1024 activation=leaky

[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=1024 activation=leaky

[route] layers=-9

[convolutional] batch_normalize=1 size=1 stride=1 pad=1 filters=64 activation=leaky

[reorg] stride=2

[route] layers=-1,-4

[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=1024 activation=leaky

[convolutional] size=1 stride=1 pad=1 filters=35 activation=linear

[region] anchors = 1.3221, 1.73145, 3.19275, 4.00944, 5.05587, 8.09892, 9.47112, 4.84053, 11.2364, 10.0071 bias_match=1 classes=2 coords=4 num=5 softmax=1 jitter=.3 rescore=1

object_scale=5 noobject_scale=1 class_scale=1 coord_scale=1

absolute=1 thresh = .6 random=1

But I am getting nan for class, object and recall.

mask_scale: Using default '1.000000'
Total BFLOPS 29.340
Loading weights from build/darknet/x64/data/yolo3.weights... seen 64 Done!
Learning Rate: 0.001, Momentum: 0.9, Decay: 0.0005
Resizing 576 x 576
Loaded: 0.783732 seconds
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.517770, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.517663, Avg Recall: -nan, count: 0

Please suggest a solution if anyone has faced the same issue.

kpreeti096 avatar Jun 14 '18 11:06 kpreeti096

@kpreeti096 This is normal.

https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects

Note: If during training you see nan values for avg (loss) field - then training goes wrong, but if nan is in some other lines - then training goes well.
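In other words, a per-region line with count: 0 (like the ones above) most likely just means that this particular batch contained no labeled objects, so the per-object averages are undefined (nan); the value to watch is the avg loss in the iteration summary line (e.g. 207: 0.218326, 0.224815 avg, ...).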

AlexeyAB avatar Jun 14 '18 11:06 AlexeyAB

I am getting this while training. Is that a problem?

Region 16 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.227234, .5R: -nan(ind), .75R: -nan(ind), count: 0
Region 23 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.181644, .5R: -nan(ind), .75R: -nan(ind), count: 0

class_id = 0, name = table, ap = 0.00 %
for thresh = 0.25, precision = 0.00, recall = -nan(ind), F1-score = -nan(ind)
for thresh = 0.25, TP = 0, FP = 24, FN = 0, average IoU = 0.00 %

mean average precision (mAP) = 0.000000, or 0.00 %

tuneshverma avatar Jul 18 '18 06:07 tuneshverma

I am training 5 classes on CPU (Intel Core i7-5500, 2.4 GHz) with 8 GB RAM. How many pictures should I train per class to get a good result, and how long will it take to finish?

Brandy24 avatar Feb 12 '20 23:02 Brandy24

> I am training 5 classes on CPU (Intel Core i7-5500, 2.4 GHz) with 8 GB RAM. How many pictures should I train per class to get a good result, and how long will it take to finish?

Do you know how to stop the training and where the output is obtained?

YARRS avatar Apr 30 '20 04:04 YARRS

@AlexeyAB: I am trying to train a custom object detection model using yolov3.cfg (after making the required changes) and the darknet53.conv.74 pre-trained weights. But I am getting the following output:

v3 (mse loss, Normalizer: (iou: 0.75, obj: 1.00, cls: 1.00) Region 82 Avg (IOU: 0.390918), count: 5, class_loss = 293.062775, iou_loss = 1.763977, total_loss = 294.826752
v3 (mse loss, Normalizer: (iou: 0.75, obj: 1.00, cls: 1.00) Region 94 Avg (IOU: 0.358087), count: 2, class_loss = 1011.123230, iou_loss = 0.429443, total_loss = 1011.552673
v3 (mse loss, Normalizer: (iou: 0.75, obj: 1.00, cls: 1.00) Region 106 Avg (IOU: 0.000000), count: 1, class_loss = 4316.066406, iou_loss = 0.000000, total_loss = 4316.066406
total_bbox = 7, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, obj: 1.00, cls: 1.00) Region 82 Avg (IOU: 0.376261), count: 7, class_loss = 293.145996, iou_loss = 1.990112, total_loss = 295.136108
v3 (mse loss, Normalizer: (iou: 0.75, obj: 1.00, cls: 1.00) Region 94 Avg (IOU: 0.078115), count: 1, class_loss = 1010.538513, iou_loss = 1.803406, total_loss = 1012.341919
v3 (mse loss, Normalizer: (iou: 0.75, obj: 1.00, cls: 1.00) Region 106 Avg (IOU: 0.000000), count: 1, class_loss = 4321.592773, iou_loss = 0.000000, total_loss = 4321.592773
total_bbox = 15, rewritten_bbox = 0.000000 %

  1. Could you please explain the output? Which field shows the iteration number here?
  2. Also, my weights are not getting saved into the backup folder once I stop training, even after 12 hours of training with 1 class and 150 training samples.
  3. Just to confirm, could you please tell me the steps to stop the model training. Thanks!

ketand334 avatar Jan 29 '21 05:01 ketand334

@tunesh did you find the reason for this, and were you able to solve it? I am also having the same issue.

athithya-raj avatar Jun 27 '22 05:06 athithya-raj