deep-head-pose
Inconsistency in training loss (300W-LP) and testing loss (AFLW2000). What should be the convergence criterion and when to save best model?
Hi @natanielruiz @tfygg ,
I am training the model again on the 300W-LP dataset with a filtered filename list. The training loss fluctuates heavily, as mentioned in previous issues #6 and #10, even though for some iterations the losses are quite low:
```
Epoch [25/25], Iter [600/3825] Losses: Yaw 2.5382, Pitch 25.2214, Roll 18.6293
Epoch [25/25], Iter [700/3825] Losses: Yaw 3.4427, Pitch 56.4101, Roll 60.4185
Epoch [25/25], Iter [800/3825] Losses: Yaw 3.8120, Pitch 10.9580, Roll 12.5700
Epoch [25/25], Iter [900/3825] Losses: Yaw 6.2587, Pitch 36.2516, Roll 29.5404
Epoch [25/25], Iter [1000/3825] Losses: Yaw 3.9143, Pitch 13.5918, Roll 11.6238
Epoch [25/25], Iter [1100/3825] Losses: Yaw 2.8406, Pitch 16.2069, Roll 11.7216
Epoch [25/25], Iter [1200/3825] Losses: Yaw 3.1640, Pitch 6.9615, Roll 3.9374
Epoch [25/25], Iter [1300/3825] Losses: Yaw 4.6969, Pitch 8.0815, Roll 9.0429
Epoch [25/25], Iter [1400/3825] Losses: Yaw 3.1008, Pitch 6.8233, Roll 4.4145
Epoch [25/25], Iter [1500/3825] Losses: Yaw 3.5320, Pitch 53.3095, Roll 41.4802
Epoch [25/25], Iter [1600/3825] Losses: Yaw 3.7685, Pitch 7.2890, Roll 8.7627
Epoch [25/25], Iter [1700/3825] Losses: Yaw 3.2166, Pitch 19.6407, Roll 12.9610
Epoch [25/25], Iter [1800/3825] Losses: Yaw 3.6263, Pitch 6.8446, Roll 5.8751
Epoch [25/25], Iter [1900/3825] Losses: Yaw 3.7254, Pitch 12.2385, Roll 9.0497
Epoch [25/25], Iter [2000/3825] Losses: Yaw 4.3334, Pitch 10.8476, Roll 4.3712
Epoch [25/25], Iter [2100/3825] Losses: Yaw 4.8823, Pitch 13.0971, Roll 17.6704
Epoch [25/25], Iter [2200/3825] Losses: Yaw 2.9647, Pitch 5.1831, Roll 5.9912
Epoch [25/25], Iter [2300/3825] Losses: Yaw 2.6243, Pitch 20.3848, Roll 10.7074
Epoch [25/25], Iter [2400/3825] Losses: Yaw 4.3780, Pitch 16.6918, Roll 10.1041
Epoch [25/25], Iter [2500/3825] Losses: Yaw 2.6419, Pitch 29.8599, Roll 23.3731
Epoch [25/25], Iter [2600/3825] Losses: Yaw 3.0582, Pitch 23.6246, Roll 15.0430
Epoch [25/25], Iter [2700/3825] Losses: Yaw 4.5449, Pitch 11.4036, Roll 9.0669
Epoch [25/25], Iter [2800/3825] Losses: Yaw 3.3777, Pitch 6.4258, Roll 4.7266
Epoch [25/25], Iter [2900/3825] Losses: Yaw 4.5212, Pitch 8.0623, Roll 5.5993
Epoch [25/25], Iter [3000/3825] Losses: Yaw 3.5405, Pitch 11.6594, Roll 9.8117
Epoch [25/25], Iter [3100/3825] Losses: Yaw 2.8780, Pitch 10.0156, Roll 9.4295
Epoch [25/25], Iter [3200/3825] Losses: Yaw 3.9240, Pitch 8.4466, Roll 4.5813
Epoch [25/25], Iter [3300/3825] Losses: Yaw 4.6378, Pitch 8.8315, Roll 8.9284
```
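A smoothed running average of the loss is often easier to read as a convergence signal than these raw per-iteration values. A minimal sketch (my own illustration, not part of train_hopenet.py):

```python
def ema_update(avg, value, beta=0.98):
    """Exponential moving average; larger beta gives a smoother curve."""
    return beta * avg + (1.0 - beta) * value

# A few of the noisy pitch losses from the log above:
losses = [25.2, 56.4, 10.9, 36.2, 13.5, 16.2, 6.9, 8.0]
avg = losses[0]
for loss in losses[1:]:
    avg = ema_update(avg, loss)
# The smoothed value sits between the extremes of the raw sequence.
print(avg)
```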
While testing the model on AFLW2000 I get a slightly higher error in yaw:

```
Test error in degrees of the model on the 1969 test images. Yaw: 13.6368, Pitch: 7.7751, Roll: 6.1729
```
I am saving the model from the iteration where all three training losses are at their minimum.
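An alternative to picking the iteration with the lowest training loss is to evaluate on a held-out split after every epoch and keep the checkpoint with the lowest mean angular error, so that the fluctuating 300W-LP loss no longer drives the choice. A hedged sketch (the error values are made up; in the real loop `torch.save(model.state_dict(), ...)` would run where the comment indicates):

```python
def best_epoch(val_errors):
    """Index and value of the epoch with the lowest mean validation error (degrees)."""
    idx = min(range(len(val_errors)), key=lambda i: val_errors[i])
    return idx, val_errors[idx]

# Hypothetical per-epoch mean of yaw/pitch/roll MAE on a held-out split:
errors = [12.4, 9.8, 8.9, 9.3, 10.1]
epoch, err = best_epoch(errors)
# In the real training loop you would call torch.save(model.state_dict(), path)
# whenever the current epoch's validation error improves on the best so far.
```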
Can you help me find out the reason? Below is the command I am using to train the model (I am continuing training from a saved snapshot):
```
train_hopenet.py --data_dir ".\300W_LP" --filename_list "300W_LP_filename_filtered.txt" --snapshot "Pruned_Hopenet_0.5.pth" --batch_size 32 --dataset "Pose_300W_LP" --num_epochs 25 --alpha 1 --output_string "prunedReTrain_0.5_1st" --lr 0.00001
```
Note: Just an observation: during training the yaw loss is lower than the pitch and roll losses, yet at test time on AFLW2000 the yaw error is the largest. Is this because of the dataset? Did you observe anything like this when you trained and tested your model?