Car-Pedestrian multi class object detection DetectNet mAP and inference

Open eweill opened this issue 7 years ago • 31 comments

I have been working on multi-class detection for cars and pedestrians using DetectNet, as mentioned in MCOD. I can successfully train a single-class DetectNet model for cars and for pedestrians (separately); however, I am unable to train the 2-class network correctly. Note that when I train the pedestrian network on KITTI, I achieve great classification, but the mAP value does not reflect that (it hovers around 10, while I can get upwards of 55 for cars).

With that said, when I try to use the 2-class DetectNet with a dataset created from KITTI using dontcare,car,pedestrian, it only seems to detect cars and no pedestrians at all (i.e. the mAP for class 1 remains 0 for the entire training process). I would like to perform inference with this model; however, with an mAP of 0, pedestrians aren't detected at inference time.

Can anyone point me in the right direction when training the 2-class network?

eweill avatar Dec 23 '16 00:12 eweill

@eweill

You need to change the .prototxt file to detect multiple classes. Refer to the links below:

https://github.com/NVIDIA/caffe/blob/caffe-0.15/examples/kitti/detectnet_network.prototxt
https://github.com/NVIDIA/caffe/blob/caffe-0.15/examples/kitti/detectnet_network-2classes.prototxt

varunvv avatar Dec 23 '16 12:12 varunvv

It's notoriously difficult to train multi-class DetectNet. However, in case this helps, I found it easier to first train a single-class network (cars) and then fine-tune on the 2-class network (cars + pedestrians).

gheinrich avatar Dec 23 '16 12:12 gheinrich

@gheinrich Since I have trained single-class networks for both cars and pedestrians, I would like to try fine-tuning a 2-class network. However, I am unable to use the weights directly due to the dimensionality of the classifier layer.

I saved my 1-class car trained network as a "pre-trained model" in DIGITS. I then selected the 1-class model, and customized it by pasting in the 2-class prototxt. After selecting all my other desired parameters, I started the training and received the following error:

ERROR: Cannot copy param 0 weights from layer 'cvg/classifier'; shape mismatch. Source param shape is 1 1024 1 1 (1024); target param shape is 2 1024 1 1 (2048). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.

Is there an easy way to convert either the model file or the weights file to "fine-tune" in the manner that you were suggesting or am I thinking about it entirely incorrectly? Thanks.

eweill avatar Dec 27 '16 05:12 eweill

Why don't you rename cvg/classifier as suggested in the error message?

gheinrich avatar Dec 28 '16 21:12 gheinrich
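
One way to apply the rename suggested in the error message is to edit the network description before pasting it into DIGITS. The sketch below is only an illustration: the helper name `rename_layer`, the new layer name `cvg/classifier_2class`, and the trimmed prototxt fragment are all made up for this example, and it assumes the layer name is written as `name: "cvg/classifier"`. It rewrites only the `name:` field, since Caffe matches saved weights to layers by layer name; the `top:` and `bottom:` blob names are left alone so the rest of the network still connects.

```python
import re


def rename_layer(prototxt_text, old_name, new_name):
    """Rename a layer by rewriting only its `name:` field (hypothetical helper)."""
    pattern = re.compile(r'(\bname\s*:\s*)"' + re.escape(old_name) + r'"')
    # A function replacement avoids any backslash handling in the new name.
    return pattern.sub(lambda m: m.group(1) + '"' + new_name + '"', prototxt_text)


# Illustrative fragment only, not the real DetectNet network file.
example = '''layer {
  name: "cvg/classifier"
  type: "Convolution"
  bottom: "some_input_blob"
  top: "cvg/classifier"
  convolution_param { num_output: 2 kernel_size: 1 }
}
layer {
  name: "coverage/sig"
  type: "Sigmoid"
  bottom: "cvg/classifier"
  top: "coverage"
}'''

renamed = rename_layer(example, "cvg/classifier", "cvg/classifier_2class")
print(renamed)
```

Because the renamed layer no longer matches any layer in the saved 1-class model, its weights would be initialized from scratch while the other layers are still copied.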

@gheinrich Thanks. I realized that as soon as I posted the error message and got that fixed. I am still unable to train the 2-class network. I simply use a single-class network that is trained on cars (with an mAP of 62, so it performs quite well) and try to "fine-tune" it with pedestrians as the second class.

I replaced the custom network with a 2-class DetectNet, performed mean subtraction, and set the other hyperparameters (learning rate, learning rate policy, batch size/accumulation, etc.). No matter how I set the hyperparameters or how many iterations I train for, class 2 remains at 0 mAP. Am I missing an important piece to make the network train on the second class as well? Below are some of the current results I am getting.

[Screenshot of training results, 2017-01-02]

eweill avatar Jan 02 '17 18:01 eweill

@eweill Hi there, did you manage to resolve the problem you had mentioned? I am facing the same problem...

ShervinAr avatar Jan 14 '17 07:01 ShervinAr

@ShervinAr I haven't resolved the problem yet. I have tried many different approaches, and no matter how I create the LMDB dataset or train the model (different hyperparameters), I can't seem to get the second class to ever be detected (mAP stays at 0 for 60 or 120 epochs).

eweill avatar Jan 15 '17 16:01 eweill

@eweill My guess is that the problem might have its roots in:
1- different learning rates being required for learning cars vs. pedestrians
2- an insufficient number of training/val images for pedestrian detection: not all images in the training set include pedestrians, and I am not sure how the network would react to this...

ShervinAr avatar Jan 16 '17 07:01 ShervinAr
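
The second guess can be checked before training. Here is a minimal sketch, assuming KITTI-style label files (one object per line, with the object type as the first whitespace-separated field). The function name and the sample labels are invented for illustration; the numbers in the sample lines are placeholders, not real annotations.

```python
from collections import Counter


def images_per_class(label_files):
    """Count how many label files contain at least one object of each class.

    `label_files` maps a file name to the text content of that label file.
    """
    counts = Counter()
    for text in label_files.values():
        # Collect the distinct (lower-cased) object types present in this file.
        classes = {line.split()[0].lower() for line in text.splitlines() if line.strip()}
        counts.update(classes)
    return counts


# Made-up label contents; only the first field matters for this check.
sample = {
    "000000.txt": "Car 0.0 0 0.0 100 100 200 200 0 0 0 0 0 0 0\n"
                  "Pedestrian 0.0 0 0.0 300 100 330 180 0 0 0 0 0 0 0\n",
    "000001.txt": "Car 0.0 0 0.0 50 60 150 160 0 0 0 0 0 0 0\n"
                  "Car 0.0 0 0.0 250 60 350 160 0 0 0 0 0 0 0\n",
    "000002.txt": "DontCare -1 -1 -10 10 10 20 20 -1 -1 -1 -1000 -1000 -1000 -10\n",
}

counts = images_per_class(sample)
for name, n in sorted(counts.items()):
    print(name, n)
```

If the pedestrian count turns out to be a small fraction of the car count, that would at least be consistent with the imbalance suspected above.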

@eweill Hello. I have a similar problem and a basic question. For multi-class detection, do I have to change the dataset first? I mean, if the dataset label files have 10 class types (car, pedestrian, van, pickup, ...), do I have to delete the lines for van, pickup, ..., or is it enough to set dontcare,car in the dataset creation step in DIGITS?

hadign20 avatar Mar 24 '17 04:03 hadign20

Hello All,

I used the "detectnet_network-2classes.prototxt" file to train my network on KITTI data for cars and pedestrians. But I am getting the following error while training:

<< error code -11
Train net output #0: loss_bbox = 1.12773 (* 2 = 2.25547 loss)
Train net output #1: loss_coverage = 4.84529 (* 1 = 4.84529 loss)
Iteration 1264, lr = 0.0001
Snapshotting to binary proto file snapshot_iter_1276.caffemodel
Snapshotting solver state to binary proto file snapshot_iter_1276.solverstate
Iteration 1276, Testing net (#0)
Ignoring source layer train_data
Ignoring source layer train_label
Ignoring source layer train_transform

Any suggestions?

Thanks,
Deepika

deagarwa avatar Mar 31 '17 11:03 deagarwa

Training a multi-class network based on DetectNet is not very easy, but recently I got some better results. I will share some tips on how to train this type of network.

lbin avatar Mar 31 '17 15:03 lbin

@lbin have you published your tips?

RomanSteinberg avatar Apr 11 '17 13:04 RomanSteinberg

^ I am also wondering the same. I am running into similar problems, with the other class having 0 mAP throughout the whole training procedure.

shinaushin avatar May 26 '17 22:05 shinaushin

@varunvv @lbin @gheinrich

I noticed that the 2-class prototxt contains these lines:
object_class: { src: 1 dst: 0} # cars -> 0
object_class: { src: 8 dst: 1} # pedestrians -> 1

1- Could you explain the "src" field? The "dst" field seems to be the class index, as the comment implies.
2- Since we identify classes by index (0 and 1, for example), how does that match the label names in KITTI format (cars as 0, pedestrians as 1)?

Ty so much.

antoniodourado avatar May 27 '17 07:05 antoniodourado

Hello, the src field needs to be set to the index of your class in the class mappings (see https://github.com/NVIDIA/DIGITS/blob/digits-5.0/digits/extensions/data/objectDetection/README.md#custom-class-mappings).

gheinrich avatar May 28 '17 09:05 gheinrich
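
To make the relationship concrete, here is a small sketch of how the `src` values could be derived. It assumes, per the reply above, that `src` is the position of the class name in the class list used when the dataset was created (for example the `dontcare,car,pedestrian` list mentioned at the top of this thread), and that `dst` is simply the output index wanted in the network. The helper name is made up, and the exact mapping rules should be checked against the linked README.

```python
def object_class_lines(dataset_classes, network_classes):
    """Build object_class entries for a prototxt (hypothetical helper).

    dataset_classes: comma-separated class list used at dataset creation.
    network_classes: classes the network should output, in dst order.
    """
    names = [c.strip().lower() for c in dataset_classes.split(",")]
    lines = []
    for dst, cls in enumerate(network_classes):
        # Assumption: src is the index of the class in the dataset class list.
        src = names.index(cls.lower())  # raises ValueError if the class is absent
        lines.append("object_class: { src: %d dst: %d } # %s -> %d" % (src, dst, cls, dst))
    return lines


lines = object_class_lines("dontcare,car,pedestrian", ["car", "pedestrian"])
print("\n".join(lines))
```

Under this assumption, a dataset built with `dontcare,car,pedestrian` would give pedestrians `src: 2`, whereas the 2-class prototxt quoted above uses `src: 8`, so the value may need to be adjusted to match the class list of the dataset in use.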

Hello,

Has anyone successfully trained a 2-class DetectNet? Can anyone share some tips on this topic?

Thank you. AP

linuzlover avatar Jul 10 '17 16:07 linuzlover

I am getting only a single class detected. Could someone please help me understand why 2-class detection is not working with the KITTI dataset, even when the above network is used?

sulth avatar Sep 18 '17 14:09 sulth

@lbin Could you share your results? I have the same problem, with only one class learning correctly.

aprentis avatar Sep 21 '17 08:09 aprentis

@lbin Please share the tips on 2-class training. I tried different approaches, but the network failed to learn pedestrians. It is learning cars only.

sulth avatar Sep 23 '17 09:09 sulth

@lbin Could someone please help me make 2-class detection work? I have tried many approaches, but all of them failed.

sulth avatar Sep 23 '17 09:09 sulth

Anyone?

elaith9 avatar Nov 28 '17 09:11 elaith9

@elaith9 I managed to train two classes on the MS COCO dataset. This requires a large amount of time and pre-trained weights. After 300 epochs, one of the classes did not rise above 1 mAP and the second did not rise above 5 mAP. It looks like tiny-yolo will work much more reliably. SSD has not yet been implemented for TensorRT, so it may be worth trying networks such as MobileNet or SqueezeNet.

Maxfashko avatar Nov 28 '17 19:11 Maxfashko

@eweill How did you fix this ERROR: Cannot copy param 0 weights from layer 'cvg/classifier'; shape mismatch. Source param shape is 1 1024 1 1 (1024); target param shape is 2 1024 1 1 (2048)?

adithya-p avatar Mar 24 '18 08:03 adithya-p

@adithya-p @sulth @linuzlover @shinaushin OK, I see a lot of you are struggling with the same problem as I did. I've decided to write two blog posts about it and explain in detail how to do custom multi-class object detection with DIGITS. Here you go:

https://labs.coria.com/blog/computer-vision/PreparingDataForCustomObjectDetectionUsingNvidiaDigits?sc_camp=33AFA8630062426190B5760C8FDF17CF
https://labs.coria.com/en/blog/computer-vision/TrainingACustomMulticlassObjectDetectionModel?sc_camp=33AFA8630062426190B5760C8FDF17CF

If you have any questions, feel free to ask.

elaith9 avatar Mar 24 '18 10:03 elaith9

@elaith9 Thank you so much, but the links are dead. Can someone share with us how to train a multi-class DetectNet on DIGITS (maybe with the KITTI dataset or another dataset)?

dqthebt24 avatar Jun 14 '18 14:06 dqthebt24

@dqthebt24 The links should be working now. Sorry.

elaith9 avatar Jun 14 '18 14:06 elaith9

Thank you so much @elaith9

dqthebt24 avatar Jun 14 '18 23:06 dqthebt24

Can DetectNet be used to train models for detecting more than 2 classes? If yes, are the changes to be made similar to those in the links you posted, @elaith9?

Adithyak1998 avatar Jun 27 '18 06:06 Adithyak1998

@Adithyak1998 Yes, DetectNet can be used to train models for detecting more than 2 classes. The changes are similar to those in the links posted above. Use diffchecker to compare the network files and get an insight into what changes.

adithya-p avatar Jun 27 '18 09:06 adithya-p

@elaith9 If you don't mind, could you update your links again? They don't work for me.

mindmad avatar Apr 19 '19 03:04 mindmad