PointPillars

Bad inference result on sample after overfitting on same sample

Open jonasdieker opened this issue 1 year ago • 15 comments

Hi @zhulf0804 ,

I wanted to ensure the model can memorise a single training example. To do this, I set the __len__() method in the Dataset to return 1. During training I printed the data_dict to confirm that the same sample was used for each iteration. Since the dataset length was set to 1, each epoch consisted of a single training step.
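For reference, the change amounts to something like this (a minimal sketch of the idea, not code from the repo; the wrapper class below is my own illustration):

```python
from torch.utils.data import Dataset


class SingleSampleDataset(Dataset):
    """Expose exactly one sample of an existing dataset, so every
    training step sees identical data (useful for overfitting tests)."""

    def __init__(self, base_dataset, index=0):
        self.base = base_dataset
        self.index = index

    def __len__(self):
        # Forcing the length to 1 makes each epoch a single training step.
        return 1

    def __getitem__(self, _):
        # Always return the same underlying sample, ignoring the requested index.
        return self.base[self.index]
```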

I visualised the training curves in TensorBoard and, as expected, all three losses eventually decreased to 0. Then I wanted to visualise the model's predictions. For this I used the test.py script. However, when running on the same sample used for training (000000.bin), the model produces zero predictions.

If I set score_thr in pointpillars.py to 0, then I get a lot of predictions, but they obviously all have very low confidence.

Any idea where I am going wrong?

jonasdieker avatar Jun 23 '23 07:06 jonasdieker

Hi @jonasdieker, that's strange. Could you post the visualized predictions when setting score_thr to 0? By the way, did you load the pretrained weights successfully?

zhulf0804 avatar Jun 23 '23 07:06 zhulf0804

Hi thank you for your very fast reply!

Sorry, maybe I should have made it clear that I wanted to train from scratch on a single KITTI sample to see whether I can get decent predictions by overfitting. Therefore, no pretrained weights were loaded; instead, I loaded the model weights I saved from my overfit-training run, produced as described above.

The reason: I tried to do the same for NuScenes to test whether the model can memorise the new data when overfitting. In that case the model also predicts nothing; however, I am not able to get a zero loss even after playing with the parameters, so there is likely more parameter tuning I still need to do ...

Here is the visualisation you asked for. (Note: I am using a different visualisation function because yours did not work for me over SSH.)

White is pedestrian, green is cyclist and blue is car.

[image: predicted boxes on 000000.bin visualised with score_thr set to 0]

Here are the confidences:

[0.0112691  0.01061759 0.01054672 0.01012148 0.01011159 0.00997026
 0.00983873 0.00945836 0.00936741 0.00894571 0.00888245 0.00886574
 0.00883586 0.00870235 0.00864896 0.00861476 0.00859446 0.00854981
 0.00853697 0.00851393 0.00847296 0.00834575 0.00832187 0.00829636
 0.00829282 0.00826259 0.00825665 0.00825058 0.00824824 0.00824112
 0.00823086 0.00821262 0.00817523 0.00817244 0.00815322 0.00815221
 0.00809674 0.00809228 0.00809175 0.00807787 0.00805884 0.00801394
 0.00799607 0.00798928 0.00394109 0.00385207 0.00380854 0.00376242
 0.00368402 0.00364244]

And the class counts:

[44, 4, 2]

Hope this is somewhat helpful for you!

jonasdieker avatar Jun 23 '23 08:06 jonasdieker

One more comment worth making: in the KITTI dataloader I actually commented out the data_augment function.

I did this in order to consistently get the same data for overfitting. I only use point_range_filter even for split="train".
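Roughly, the __getitem__ path then looks like this (a sketch only; load_sample is a hypothetical helper and the exact signatures may differ from the repo's):

```python
def __getitem__(self, index):
    # Load points, calibration and labels for this sample as usual.
    data_dict = self.load_sample(index)  # hypothetical helper

    # Augmentation disabled so every iteration sees identical data:
    # data_dict = data_augment(self.CLASSES, self.root_dir, data_dict)

    # Keep only the point-cloud range crop, even for split="train".
    data_dict = point_range_filter(data_dict, point_range=self.point_range)
    return data_dict
```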

jonasdieker avatar Jun 23 '23 08:06 jonasdieker

Hello @jonasdieker, did you also visualize the ground-truth result and the prediction using the weights provided by this repo on 000000.bin? Are they reasonable?

zhulf0804 avatar Jun 23 '23 13:06 zhulf0804

Yes, I did, and they were fine. That is why I am confused by my experiment's outcome!

Edit: I will send a visualisation of that when I have access to the machine again!

jonasdieker avatar Jun 23 '23 13:06 jonasdieker

Ok. One more thing: could you verify once more that the single training example is 000000.bin?

zhulf0804 avatar Jun 23 '23 13:06 zhulf0804

So I tried it again and verified that I was overfitting on the same sample I was testing on. I tried it with 000000.bin and then also with 000001.bin individually; both times the loss was practically zero, but test.py returned no bounding boxes at all with the default settings defined here:

https://github.com/zhulf0804/PointPillars/blob/b9948e73505c8d6bfa631ffdf76c7148e82c5942/model/pointpillars.py#L262-L266

Could you try to repeat this experiment? It should only take a few minutes.

Edit:

When setting the train_dataloader to split="val", and still with the training set length set to 1, I can perform training and validation on the same sample (000001.bin) only. The weird thing is that if I look at TensorBoard I get the following plots:

[image: TensorBoard loss curves for train vs. val on the single sample]

So now I am even more confused, but it confirms that val/test performs really badly in this specific scenario. In particular, the class loss actually diverges, which explains why the confidence is so low and all boxes are filtered out by the get_predicted_bboxes_single method with the default params linked above.
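That last step is essentially just a score mask, so with every confidence around 0.01 and a threshold on the order of 0.1 nothing survives. A simplified sketch of the effect (not the repo's exact code, and the 0.1 default is an assumption):

```python
import numpy as np

scores = np.array([0.0113, 0.0106, 0.0105, 0.0101])  # confidences like the ones printed above
score_thr = 0.1                                       # assumed default magnitude

keep = scores > score_thr
print(scores[keep])  # -> [] : every candidate box is discarded
```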

jonasdieker avatar Jun 26 '23 08:06 jonasdieker

@zhulf0804 Ok, I think this is kind of interesting:

The only difference between train and val in train.py is that model.eval() is called (which of course you should be calling). But if I comment out that line I get the following plots:

[image: TensorBoard loss curves with model.eval() commented out in train.py]

Doing the same in test.py I get:

[image: test.py predictions on the overfit sample with model.eval() commented out]

which is perfect! So overfitting works exactly as expected with this change. However, I do not understand how this impacts the performance, since switching from train mode to eval mode does the following:

[image: summary of what switching between train and eval mode changes]

I think I need to give this some more thought. Let me know if you have an explanation!

jonasdieker avatar Jun 26 '23 10:06 jonasdieker

Hello, @jonasdieker. Do both the validation cls loss and the visualized predictions (using test.py) become fine just by removing model.eval(), i.e. the following line? https://github.com/zhulf0804/PointPillars/blob/b9948e73505c8d6bfa631ffdf76c7148e82c5942/train.py#L139

zhulf0804 avatar Jun 27 '23 06:06 zhulf0804

Hello @zhulf0804, yes that is exactly right!

jonasdieker avatar Jun 27 '23 07:06 jonasdieker

Ok, I'm also confused by this result. I'll test it when I have access to the machine. Besides, I'm looking forward to your explanation of this question. Best.

zhulf0804 avatar Jun 27 '23 13:06 zhulf0804

Do you have any updates on this? @jonasdieker, did you find out what the issue was? I am getting the same problem: overfitting on one (or a few) samples, the loss goes to 0, but then there are 0 predictions using test.py. Even worse, when I run test.py multiple times with NO changes, I get different results (sometimes a few bboxes, most of the time zero: [] [] []).
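One quick sanity check for the run-to-run variation (generic PyTorch/NumPy seeding, not something specific to this repo): pin every RNG before running inference; if the outputs still differ, the variation must come from somewhere other than Python-level randomness (e.g. non-deterministic CUDA kernels or leftover augmentation/dropout in the test path).

```python
import random

import numpy as np
import torch


def seed_everything(seed: int = 42) -> None:
    # Fix all the common sources of randomness for a reproducibility check.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
```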

mdane-x avatar Oct 24 '23 18:10 mdane-x

Hi @mdane-x, as far as I remember, overfitting on one (or a few) sample(s) didn't work. I ended up commenting out model.eval(). I believe the issue was due to the (batch) normalisation layers. If you have a good explanation of what is going on, please add it here!
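One way the normalisation can cause exactly this (a standalone PyTorch sketch of the hypothesis, not the repo's code): in train mode BatchNorm normalises with the current batch's mean/variance, while in eval mode it switches to its running estimates; those estimates lag behind the activations the rest of the network was just fitted to, so the two modes can produce very different outputs.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(4, momentum=0.1)
x = torch.randn(2, 4) * 10 + 5     # one fixed "batch" that we keep overfitting on

bn.train()
y_train = bn(x)                    # normalised with the batch's own mean/var

bn.eval()
y_eval = bn(x)                     # normalised with the slowly-updated running stats

print((y_train - y_eval).abs().max())  # large gap -> very different activations at test time
```

So a network whose later layers were fitted to the train-mode activations can end up producing near-zero confidences once eval mode swaps the statistics underneath them.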

jonasdieker avatar Oct 31 '23 14:10 jonasdieker

Hi @jonasdieker, thanks for the answer. I haven't managed to make it work, even after removing the eval() line. I am getting empty predictions with any model trained on a few samples.

mdane-x avatar Oct 31 '23 14:10 mdane-x

@mdane-x, hmmm that is very strange. I am not sure how to help you. In my experience it helps to visualise as much as you can. What does your validation loss look like? Is it also going to zero?

jonasdieker avatar Oct 31 '23 14:10 jonasdieker