What is the best approach to troubleshoot the PCA loss training?
I am currently training a model with 40+ keypoints (hand movement of macaques) and having trouble with the pca_singleview loss. All other types of losses and model combinations work fine and reach acceptable RMSEs within 600-1000 epochs. The pca_singleview model shows some reduction in validation loss, but then stagnates at an RMSE of 40 and stays there for a while.
Looking at the train_pca_singleweight_loss, all values are NaN/0.
There seems to be something wrong with the PCA calculation. What would be the best way to go about troubleshooting this?
A few questions for you:
- How many labeled frames do you have? Since we are fitting PCA on the (x, y) coordinates, if you have 40 keypoints then you need to reduce the dimensionality of an 80-dimensional vector. So if you only have slightly more than 80 labeled frames, the PCA space is likely not being estimated well.
- Whenever you train with the PCA loss, you should see a printout on the command line that tells you the explained variance per PC, and how many PCs are required to reach 99% of the variance. What do these numbers look like for your dataset?
- What size are you reshaping the images to? 256x256 or 384x384? The size (unfortunately) determines the scale of the heatmap loss, which may require changing the hyperparameter (log_weight) on the PCA loss. How big is the weighted PCA loss compared to the heatmap loss during training (via tensorboard)? Have you tried different values for log_weight?
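If it helps to check the first two points outside of training, here is a rough sketch in plain NumPy. It is not the library's own PCA code: the function name `pca_diagnostics`, the `(n_frames, n_keypoints, 2)` array layout, and the choice to drop any frame with a missing (NaN) label are all assumptions made for illustration. The demo data is synthetic; you would swap in your own labeled coordinates.

```python
import numpy as np


def pca_diagnostics(keypoints, variance_threshold=0.99):
    """Summarize how well a PCA space can be estimated from labeled keypoints.

    keypoints: array of shape (n_frames, n_keypoints, 2), NaN for missing labels
    (hypothetical layout, adapt to however your labels are stored).
    """
    n_frames = keypoints.shape[0]
    flat = keypoints.reshape(n_frames, -1)  # 40 keypoints -> 80 columns

    # Assumption: only frames with every keypoint labeled are used.
    complete = ~np.isnan(flat).any(axis=1)
    data = flat[complete]
    n_used, n_dims = data.shape
    if n_used < 2:
        raise ValueError("need at least 2 fully labeled frames to fit PCA")

    # PCA via SVD of the mean-centered data matrix.
    centered = data - data.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2 / (n_used - 1)
    total = variances.sum()
    if total == 0:
        raise ValueError("labels have zero variance")
    explained_ratio = variances / total

    # Smallest number of components whose cumulative variance hits the threshold.
    cumulative = np.cumsum(explained_ratio)
    n_components = int(np.searchsorted(cumulative, variance_threshold)) + 1
    n_components = min(n_components, len(cumulative))

    return {
        "n_frames_total": n_frames,
        "n_frames_used": n_used,
        "n_dims": n_dims,
        "explained_ratio": explained_ratio,
        "n_components": n_components,
    }


# Synthetic demo: 200 frames, 40 keypoints, poses driven by 3 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 80))
demo = (latent @ mixing + 0.01 * rng.normal(size=(200, 80))).reshape(200, 40, 2)
demo[:10, 0, 0] = np.nan  # pretend 10 frames have an unlabeled keypoint

report = pca_diagnostics(demo)
print("frames used:", report["n_frames_used"], "of", report["n_frames_total"])
print("dimensions:", report["n_dims"])
print("components for 99% variance:", report["n_components"])
```

Two things to look at in the output: whether the number of usable frames is comfortably larger than the number of dimensions (80 in your case), and whether the component count for 99% variance looks plausible. If most of your frames have at least one occluded keypoint, the usable count under this assumption could be much smaller than your total number of labeled frames.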
Related: I'm not sure which backbone you're using, but we have the following built in: resnet50_human_hand, which is a ResNet-50 pretrained on the OneHand10k dataset (Wang et al. 2018, Mask-pose Cascaded CNN for 2d Hand Pose Estimation from Single Color Image).
I don't have access to a hand pose estimation dataset, so haven't tried this one myself, but might be worth checking out.
Ah, very nice! I will give it a try :)
Also, do you have a single camera or will you be doing multi-camera/3D pose estimation?
Single camera. We had a DLC model but were disappointed with how it was performing, and adding more and more images was not helping in the end.