DetectNet + MSCOCO not working
Hey there.
I'm trying to train a DetectNet on MSCOCO, based on dusty-nv's example. I wrote a Python script to convert the COCO annotations to KITTI format and then built the dataset normally. I used the dusty-nv example as a basis: https://github.com/dusty-nv/jetson-inference
I adapted the network as follows:
name: "DetectNet"
layer {
  name: "train_data"
  type: "Data"
  top: "data"
  data_param {
    backend: LMDB
    source: "/media/antonio/LinuxOutro/DIGITS/digits/jobs/20170522-184156-7112/train_db/features/data.mdb"
    batch_size: 10
  }
  include: { phase: TRAIN }
}
layer {
  name: "train_label"
  type: "Data"
  top: "label"
  data_param {
    backend: LMDB
    source: "/media/antonio/LinuxOutro/DIGITS/digits/jobs/20170522-184156-7112/train_db/labels/data.mdb"
    batch_size: 10
  }
  include: { phase: TRAIN }
}
layer {
  name: "val_data"
  type: "Data"
  top: "data"
  data_param {
    backend: LMDB
    source: "/media/antonio/LinuxOutro/DIGITS/digits/jobs/20170522-184156-7112/val_db/features/data.mdb"
    batch_size: 6
  }
  include: { phase: TEST stage: "val" }
}
layer {
  name: "val_label"
  type: "Data"
  top: "label"
  data_param {
    backend: LMDB
    source: "/media/antonio/LinuxOutro/DIGITS/digits/jobs/20170522-184156-7112/val_db/labels/data.mdb"
    batch_size: 6
  }
  include: { phase: TEST stage: "val" }
}
layer {
  name: "deploy_data"
  type: "Input"
  top: "data"
  input_param {
    shape { dim: 1 dim: 3 dim: 640 dim: 640 }
  }
  include: { phase: TEST not_stage: "val" }
}
For the LMDB sources, I pointed to the "data.mdb" files generated by DIGITS. I couldn't find any other lmdb files, so I assumed these were the correct ones. Are they?
It took 24 hours to train and it's not working at all: mAP is zero. It doesn't give me any bbox even when I test with the training images:
Here are the parameters for model training:
Here is an example image with the following labels:
sink 0 0 0 0.0 346.43 180.78 79.66 0 0 0 0 0 0 0
bowl 0 0 0 359.22 385.68 24.07 19.57 0 0 0 0 0 0 0
Edit1: I noticed that my labels are formatted as follows:
sink 0 0 0 X Y WIDTH HEIGHT 0 0 0 0 0 0 0
Should I format like this?
sink 0 0 0 X Y X+WIDTH Y+HEIGHT 0 0 0 0 0 0 0
When I submit it to test for detecting with bboxes:
Obviously I messed something up, can you guys help me out here?
Hello, yes, it doesn't work with "width" and "height". You have to use xmin, ymin, xmax and ymax. That means x, y, x+width, y+height, as you stated above.
Thank you for your reply.
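For anyone converting COCO annotations themselves, here is a minimal sketch of that fix. It assumes the COCO bbox is [x, y, width, height] and writes the 15-field label line used in this thread, with the unused KITTI fields set to 0. The function name is made up for illustration; it is not part of DIGITS or the COCO API.

```python
def coco_bbox_to_kitti_line(class_name, coco_bbox):
    """Build one KITTI-style label line from a COCO [x, y, width, height] box.

    Hypothetical helper: only the class name and the four box corners are
    filled in; every other KITTI field is written as 0, as in the thread.
    """
    x, y, width, height = coco_bbox
    xmin, ymin = x, y
    xmax, ymax = x + width, y + height  # KITTI wants corners, not sizes
    corners = " ".join("%.2f" % v for v in (xmin, ymin, xmax, ymax))
    # type, truncated, occluded, alpha, bbox (4 values), then 7 unused 3D fields
    return "%s 0 0 0 %s 0 0 0 0 0 0 0" % (class_name, corners)


if __name__ == "__main__":
    # The two boxes from the first post, converted to corner format.
    print(coco_bbox_to_kitti_line("sink", [0.0, 346.43, 180.78, 79.66]))
    print(coco_bbox_to_kitti_line("bowl", [359.22, 385.68, 24.07, 19.57]))
```

Note that COCO class names containing spaces would add an extra field to the line, so they would need to be renamed before writing the labels.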
Also, I tried to train on the whole MSCOCO dataset with all its classes, and I saw in another post that the default DetectNet prototxt is single-class only. That is probably also an issue, right? I must add the new classes to my prototxt.
By the way, do you know how the "src" and "dst" fields work in prototxt entries like this one?
object_class: { src: 1 dst: 0} # obj class 1 -> cvg index 0
What exactly is the cvg index?
You may experiment with the following: https://github.com/NVIDIA/caffe/blob/caffe-0.15/examples/kitti/detectnet_network-2classes.prototxt
But multiple classes are not so easy with DetectNet, especially if you have more than 10-20 classes. Even with 2-3 classes, I recommend training each class individually first and, once you are sure each one can reach a reasonable mAP, adding classes one at a time. It would be great if Nvidia decided to add another detection model that supports multiple classes more easily.
Yes, I will test with one and two classes.
In these lines, what defines src and dst? In the single-class example, src seemed to be the object class, and now dst seems to be the object class. The "src: 8" is what really confuses me. I don't know what defines the number 8.
object_class: { src: 1 dst: 0} # cars -> 0
object_class: { src: 8 dst: 1} # pedestrians -> 1
Please check the following: https://github.com/NVIDIA/caffe/pull/157 https://github.com/NVIDIA/DIGITS/tree/digits-4.0/digits/extensions/data/objectDetection#custom-class-mappings
That's very nice! I will train a single-class and a two-class DetectNet and post the results here in a few days. Has anyone tested a DetectNet with 10 or 15 classes and reported the performance?
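To make the src/dst question concrete, here is a rough sketch of how I read those links. It assumes that a DIGITS custom class mapping assigns class IDs by position in the comma-separated list (with "dontcare" first, so it gets ID 0), that src is that class ID as stored in the label database, and that dst is the coverage map channel the class is trained on. Please treat this as my interpretation and check it against the README; both helper functions are hypothetical.

```python
def class_ids_from_mapping(custom_mapping):
    """Assumed DIGITS behaviour: class ID = position in the comma-separated list."""
    names = [n.strip().lower() for n in custom_mapping.split(",") if n.strip()]
    return {name: idx for idx, name in enumerate(names)}


def object_class_lines(custom_mapping, trained_classes):
    """Emit one object_class entry per trained class.

    src is the class ID from the mapping; dst is the coverage index,
    assigned here in the order the classes are listed.
    """
    ids = class_ids_from_mapping(custom_mapping)
    return [
        "object_class: { src: %d dst: %d }" % (ids[name], dst)
        for dst, name in enumerate(trained_classes)
    ]


if __name__ == "__main__":
    # Hypothetical mapping for the two COCO classes mentioned in this thread.
    for line in object_class_lines("dontcare,sink,bowl", ["sink", "bowl"]):
        print(line)
```

Under that reading, "src: 8" in the KITTI example would simply be the ID that the default KITTI class mapping gives to pedestrians, not a value chosen in the prototxt.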
Thank you so much for now.
Hello, @ontheway16. I managed to get the single-class model working with custom mappings. However, when I tried to apply them to the two-class example, I got the following error:
I built the Dataset as follows:
And here is my network. I adapted it for two classes:
What am I doing wrong?
I made a diff and found the following:
layer {
  name: "cvg/classifier"
  type: "Convolution"
  bottom: "pool5/drop_s1"
  top: "cvg/classifier"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 1
    kernel_size: 1
    weight_filler {
      type: "xavier"
      std: 0.03
    }
    bias_filler {
      type: "constant"
      value: 0.
    }
  }
}
Above, "num_output: 1" is actually "2" in the original two-class prototxt. I couldn't see any other problems.
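If it helps anyone catch the same mismatch before a long training run, here is a small sketch that compares the number of distinct dst indices in the object_class entries with num_output of the "cvg/classifier" layer. It is a plain regex over the prototxt text, not a real protobuf parser, and it assumes the layer is named exactly as in the snippet above; the function name is made up.

```python
import re


def check_coverage_outputs(prototxt_text):
    """Return (number of distinct dst indices, num_output of cvg/classifier).

    Regex-based sketch: num_output is None if the layer is not found.
    """
    dst_indices = set(
        int(d)
        for d in re.findall(r"object_class\s*:\s*\{[^}]*?dst\s*:\s*(\d+)", prototxt_text)
    )
    match = re.search(
        r'name:\s*"cvg/classifier".*?num_output:\s*(\d+)', prototxt_text, re.DOTALL
    )
    num_output = int(match.group(1)) if match else None
    return len(dst_indices), num_output


if __name__ == "__main__":
    sample = """
    object_class: { src: 1 dst: 0 }
    object_class: { src: 8 dst: 1 }
    layer {
      name: "cvg/classifier"
      type: "Convolution"
      convolution_param {
        num_output: 1
      }
    }
    """
    num_classes, num_output = check_coverage_outputs(sample)
    if num_classes != num_output:
        print("Mismatch: %d classes but num_output is %s" % (num_classes, num_output))
```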
Nice!!! That was the problem. It's training the two-class model right now and I'll post feedback here soon.
Does it train on one class and then on the other? I'm on the 2nd epoch and only class 1 is really training, while class 0 is not. I set 30 epochs; will it maybe split them 15/15?
As far as I know, it trains on both at the same time. But be prepared to see lower mAP levels than in single-class training. That's why I said to first get higher mAP levels for each class. Let it run for a while, then decide whether more data is necessary. That's what I am doing.
It looks like you have to change the layer names when you are training on a custom dataset. I had the same problem, and changing the names of a few inception layers worked for me.
@mayankmahajan21 could you share some ideas? I am having a very similar problem.
@vxgu86 What is your problem? Can you elaborate?
Hi, can you describe in more detail what exactly you did? Or maybe sharing that part of the network would help, since I don't understand exactly how the changes were made. Thanks!
@ontheway16 visit the following link: https://www.coria.com/insights/blog/computer-vision/training-a-custom-mutliclass-object-detection-model