pytorch-pose
Could not train well on small dataset
Hi bearpaw, I tried the code with the default settings on a small dataset of 4 images and found that it trains worse than on the full MPII dataset in terms of training loss: on the full MPII dataset the training loss goes lower (~1e-4), while on the 4-image set it plateaus at ~1e-3.
This is very strange. I think verifying that the model and the code can overfit a small set is necessary before any further step, so could you offer some advice on the issue above? Thanks.
To be more concrete, the modified code and training scripts are as follows:
self.train, self.valid = self.train[:4], self.valid[:4]
was added to datasets/mpii.py to select only 4 images.
And here are the scripts:
I think 4 images are not sufficient to train a pose estimator.
Well, strictly speaking this is not training in the normal sense but a test of whether the gradient flow can drive the network to memorize such a simple setting. If it cannot, there is probably an issue with the network design, the optimizer, or something else. I think it is a verification procedure that should come before any further step.
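For reference, the kind of sanity check described here can be sketched roughly as follows. This is a minimal, dependency-free illustration in plain Python, not the pytorch-pose training loop: the linear model, the four synthetic samples, the learning rate, and the name `overfit_tiny_set` are all assumptions made up for the example. The idea is only that, on a handful of samples the model can represent exactly, the training loss should fall to nearly zero; if it does not, something in the model or the optimization setup deserves a closer look.

```python
import random


def overfit_tiny_set(steps=2000, lr=0.1, seed=0):
    """Fit y = w*x + b to 4 fixed samples with plain gradient descent.

    Returns the list of mean-squared-error values, one per step.
    Everything here is a toy stand-in for a real network and dataset.
    """
    rng = random.Random(seed)
    xs = [-1.0, -0.5, 0.5, 1.0]           # the "4 images"
    ys = [2.0 * x + 0.3 for x in xs]       # targets the model can fit exactly
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)

    losses = []
    for _ in range(steps):
        errs = [(w * x + b) - y for x, y in zip(xs, ys)]
        losses.append(sum(e * e for e in errs) / len(xs))
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errs, xs)) / len(xs)
        grad_b = 2 * sum(errs) / len(xs)
        w -= lr * grad_w
        b -= lr * grad_b
    return losses


if __name__ == "__main__":
    history = overfit_tiny_set()
    print("first loss:", history[0])
    print("last loss: ", history[-1])
```

With a real network the same check would watch the training loss on the fixed subset and expect it to keep decreasing well below what the full dataset reaches.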
There is a similar discussion on reddit: https://www.reddit.com/r/MachineLearning/comments/5pidk2/d_is_overfitting_on_a_very_small_data_set_a/
Interesting. What batch size did you use during training and testing? It is possible that the BN (batch normalization) layers are not trained properly.
Only 4 images were used and the batch size was set to 4. For now I am only looking at the training loss on these 4 images, and surprisingly it dropped more slowly than with the full MPII dataset and also plateaued at a higher value.
Why would that come from the BN layers? I also tried a small set of 500 images, and the results were more or less the same as with the 4-image set. This might indicate that BN is not the cause.
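As a rough illustration of why BN tends to be suspected with tiny datasets: in training mode a batch-norm layer normalizes with the statistics of the current batch, while in evaluation mode it uses running averages that only catch up after enough updates. The sketch below is a simplified single-feature stand-in in plain Python, not PyTorch's implementation. The class `TinyBatchNorm` is hypothetical, it omits the learnable scale and shift, and it uses the biased batch variance for the running estimate. The momentum of 0.1 and the initial running mean 0 and variance 1 match the defaults of PyTorch's `BatchNorm2d`. Note that this mechanism affects evaluation-mode outputs, so on its own it would not obviously explain a plateau in the training loss.

```python
class TinyBatchNorm:
    """Simplified batch norm over one scalar feature (illustration only)."""

    def __init__(self, momentum=0.1, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        self.running_mean = 0.0
        self.running_var = 1.0

    def forward(self, xs, training):
        if training:
            # Training mode: normalize with the current batch statistics
            # and nudge the running estimates toward them.
            mean = sum(xs) / len(xs)
            var = sum((x - mean) ** 2 for x in xs) / len(xs)
            m = self.momentum
            self.running_mean = (1 - m) * self.running_mean + m * mean
            self.running_var = (1 - m) * self.running_var + m * var
        else:
            # Evaluation mode: normalize with the running estimates only.
            mean, var = self.running_mean, self.running_var
        return [(x - mean) / (var + self.eps) ** 0.5 for x in xs]


if __name__ == "__main__":
    bn = TinyBatchNorm()
    batch = [10.0, 12.0, 14.0, 16.0]
    print("train mode:         ", bn.forward(batch, training=True))
    print("eval after 1 update:", bn.forward(batch, training=False))
    for _ in range(200):
        bn.forward(batch, training=True)
    print("eval after many:    ", bn.forward(batch, training=False))
```

After a single update the evaluation-mode output is far from the training-mode output, and the two only agree once the running statistics have converged.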