pytorch-pose

Could not train well on small dataset

Open linghai06 opened this issue 7 years ago • 5 comments

Hi bearpaw, I tried the code with the default settings on a small dataset (4 images) and found that it trains worse than on the full MPII dataset in terms of training loss: on the full MPII dataset the training loss goes lower (~1e-4), while on the 4-image subset it plateaus at ~1e-3.

This is very strange. I think verifying that the model and the code can overfit a small set is necessary before any further step. Could you provide some advice on this issue? Thanks.

linghai06 avatar Jan 12 '18 03:01 linghai06

To be more concrete, the modified code and the training script are attached below:

image

self.train, self.valid = self.train[:4], self.valid[:4]

was added in datasets/mpii.py to select only 4 images.
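A minimal sketch of the same idea with the subset size as a parameter instead of a hard-coded slice. The helper name `limit_split` and the `max_samples` argument are hypothetical and not part of pytorch-pose; the sketch only assumes that `self.train` and `self.valid` are sliceable sequences, as the line above implies.

```python
def limit_split(indices, max_samples=None):
    """Return the first `max_samples` entries of a split, or all of them.

    `indices` stands in for a sequence such as `self.train` or `self.valid`
    (assumption: it supports slicing). `max_samples=None` keeps the full split.
    """
    if max_samples is None:
        return list(indices)
    if max_samples < 0:
        raise ValueError("max_samples must be non-negative")
    # Slicing past the end is safe: a short split is returned unchanged.
    return list(indices[:max_samples])


# Hypothetical usage inside a dataset constructor:
#   self.train = limit_split(self.train, 4)
#   self.valid = limit_split(self.valid, 4)
```

Keeping the limit as an argument makes it easy to repeat the experiment with 4, 500, or all images without editing the dataset file each time.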

And here is the script:

image

linghai06 avatar Jan 12 '18 03:01 linghai06

I think 4 images are not sufficient to train a pose estimator.

bearpaw avatar Jan 26 '18 18:01 bearpaw

Well, strictly speaking this is not training in the normal sense, but a test of whether the gradient flow can drive the network to memorize such a simple setting. If it cannot, there is probably an issue with the network design, the optimizer, or something else. I think it is a verification procedure that should come before any further step.
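As a rough illustration of the kind of check meant here, the sketch below runs plain gradient descent on a toy linear model with 4 samples and 4 parameters and records the training loss. Everything in it is an assumption made for illustration: the model is a stand-in and not the hourglass network, and `overfit_check`, `TINY_INPUTS`, and `TINY_TARGETS` are made-up names. The point is only the shape of the test: when the model has enough capacity to fit the samples exactly, the training loss should keep falling toward zero rather than plateau.

```python
def predict(weights, x):
    """Linear model: dot(weights[:-1], x) + bias, where bias is weights[-1]."""
    return sum(w * xi for w, xi in zip(weights, x)) + weights[-1]


def mse(weights, inputs, targets):
    """Mean squared error over the whole (tiny) training set."""
    errors = [(predict(weights, x) - y) ** 2 for x, y in zip(inputs, targets)]
    return sum(errors) / len(errors)


def overfit_check(inputs, targets, lr=0.4, steps=500):
    """Full-batch gradient descent; returns the training loss at every step."""
    n = len(inputs)
    weights = [0.0] * (len(inputs[0]) + 1)  # feature weights plus one bias
    history = []
    for _ in range(steps):
        history.append(mse(weights, inputs, targets))
        grads = [0.0] * len(weights)
        for x, y in zip(inputs, targets):
            err = predict(weights, x) - y
            for j, xi in enumerate(x):
                grads[j] += 2.0 * err * xi / n
            grads[-1] += 2.0 * err / n  # gradient with respect to the bias
        weights = [w - lr * g for w, g in zip(weights, grads)]
    history.append(mse(weights, inputs, targets))
    return history


# Four samples, four parameters: an exact fit exists, so memorization is possible.
TINY_INPUTS = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
TINY_TARGETS = [0.5, -1.0, 2.0, 0.25]

losses = overfit_check(TINY_INPUTS, TINY_TARGETS)
print("first loss:", losses[0], "last loss:", losses[-1])
```

If the last loss in such a check stalls well above zero, that points at the optimization setup (learning rate, loss, data pipeline) rather than at a lack of data.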

There is a similar discussion on reddit: https://www.reddit.com/r/MachineLearning/comments/5pidk2/d_is_overfitting_on_a_very_small_data_set_a/

linghai06 avatar Jan 29 '18 09:01 linghai06

Interesting. What batch size did you use during training and testing? It is possible that the batch normalization (BN) layers are not trained properly.
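One way to picture the BN concern is the gap between the statistics used in training mode (computed from the current batch) and those used in evaluation mode (running averages). The sketch below is a toy one-dimensional batch norm in plain Python, not the PyTorch implementation: `ToyBatchNorm` and `eval_offset_after` are hypothetical names, the momentum of 0.1 and epsilon of 1e-5 are assumed values, and details such as the learnable scale and shift are left out.

```python
import math


class ToyBatchNorm:
    """Toy 1-D batch norm: batch statistics in training, running ones in eval."""

    def __init__(self, momentum=0.1, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        self.running_mean = 0.0
        self.running_var = 1.0

    def forward(self, batch, training):
        if training:
            # Normalize with the statistics of the current batch ...
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            # ... and move the running averages a small step toward them.
            m = self.momentum
            self.running_mean = (1 - m) * self.running_mean + m * mean
            self.running_var = (1 - m) * self.running_var + m * var
        else:
            # Evaluation mode relies entirely on the running averages.
            mean, var = self.running_mean, self.running_var
        return [(x - mean) / math.sqrt(var + self.eps) for x in batch]


def mean_of(values):
    return sum(values) / len(values)


def eval_offset_after(steps, batch):
    """Run `steps` training passes, then return |mean| of the eval-mode output."""
    bn = ToyBatchNorm()
    for _ in range(steps):
        bn.forward(batch, training=True)
    return abs(mean_of(bn.forward(batch, training=False)))


BATCH = [10.0, 12.0, 14.0, 16.0]
early = eval_offset_after(1, BATCH)    # running averages have barely moved
late = eval_offset_after(200, BATCH)   # running averages have converged
print("eval offset after 1 step:", early, "after 200 steps:", late)
```

In this toy setting the evaluation-mode output is far from zero-mean after a single update and close to zero-mean after many. Note that a mismatch of this kind only affects outputs computed in evaluation mode; a loss computed in training mode uses the batch statistics directly.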

bearpaw avatar Jan 29 '18 20:01 bearpaw

Only 4 images were used and the batch size was set to 4. Currently I only look at the training loss for these 4 images, and surprisingly it dropped more slowly than with the full MPII dataset and also plateaued at a higher value.

Why would that come from the BN layers? I also tried a small set of 500 images, and the results were more or less the same as with the set of 4 images. This might indicate that the BN layers are not the cause.

linghai06 avatar Feb 02 '18 08:02 linghai06