
Data preprocessing for 300W_LP and AFLW2000.

Open CrossEntropy opened this issue 5 years ago • 11 comments

Hi @natanielruiz, great paper and work! I have a question about data preprocessing: my model performs well on the training set but poorly on the test set. Since I am training a small network, I limit the input image resolution to 56x56. For 300W_LP I follow your method:

from PIL import Image, ImageFilter
import numpy as np

# path, cor (face bounding boxes), yaws, bins, nums and index come from my dataset loader
img = Image.open(path)
x_min, y_min = float(cor[index, 0]), float(cor[index, 1])
x_max, y_max = float(cor[index, 2]), float(cor[index, 3])
# loosely crop the face: expand the box by a random factor k in [0.2, 0.4)
k = np.random.random_sample() * 0.2 + 0.2
x_min -= 0.6 * k * abs(x_max - x_min)
y_min -= 2 * k * abs(y_max - y_min)
x_max += 0.6 * k * abs(x_max - x_min)
y_max += 0.6 * k * abs(y_max - y_min)
img = img.crop((int(x_min), int(y_min), int(x_max), int(y_max)))
# flip left-right with probability 0.5 and mirror the yaw label and its bin
prob = np.random.random_sample()
if prob < 0.5:
    yaws[index] = -yaws[index]
    bins[index] = nums - 1 - bins[index]
    img = img.transpose(Image.FLIP_LEFT_RIGHT)
# blur with probability 0.05
prob = np.random.random_sample()
if prob < 0.05:
    img = img.filter(ImageFilter.BLUR)

Finally, I use bilinear interpolation to resize the image to 56x56:

img = img.resize((56, 56), resample=Image.BILINEAR)

Similarly, AFLW2000 follows the above process:

# same loose crop for AFLW2000, but with a fixed expansion factor k = 0.20
img = Image.open(path)
x_min, y_min = float(cor[index, 0]), float(cor[index, 1])
x_max, y_max = float(cor[index, 2]), float(cor[index, 3])
k = 0.20
x_min -= 2 * k * abs(x_max - x_min)
y_min -= 2 * k * abs(y_max - y_min)
x_max += 2 * k * abs(x_max - x_min)
y_max += 0.6 * k * abs(y_max - y_min)
img = img.crop((int(x_min), int(y_min), int(x_max), int(y_max)))
img = img.resize((56, 56), resample=Image.BILINEAR)

The following figure shows the MAE curves on my train, validation and test sets. Why is my test set so bad? Is it because my data is processed incorrectly? Thanks for your help!

[figure: MAE curves for train / valid / test]

CrossEntropy avatar Jan 16 '20 04:01 CrossEntropy

Have you solved the issue? We have the same problem here as well.

chuzcjoe avatar Feb 17 '20 20:02 chuzcjoe

Hi @chuzcjoe, sorry for the late reply; I've been busy with something else lately... In the AFLW2000 landmarks, some points have a minimum value less than 0 or a maximum value greater than the image width or height. You need to clip them to the image bounds. Hope this helps!
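
To be concrete, a minimal sketch of the clipping I mean (the variables are the same ones as in my first post, with img as the PIL image and x_min, y_min, x_max, y_max as the landmark-derived box):

# clamp the landmark-derived box to the image bounds before cropping
width, height = img.size
x_min = max(0.0, x_min)
y_min = max(0.0, y_min)
x_max = min(float(width), x_max)
y_max = min(float(height), y_max)
img = img.crop((int(x_min), int(y_min), int(x_max), int(y_max)))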

CrossEntropy avatar Mar 05 '20 11:03 CrossEntropy

@CrossEntropy I trained the model, but it does not converge. Have you met this problem?

wqz960 avatar Dec 14 '20 13:12 wqz960

@wqz960 I don't recall running into this problem. Have you cleaned the dataset correctly?

CrossEntropy avatar Dec 16 '20 06:12 CrossEntropy

@CrossEntropy I did not clean the data; I used all the images (about 30K) for training. The lowest losses for yaw, pitch and roll are around ±1.5. When I evaluate the model on AFLW2000 the results are bad: 7.5 for yaw, 10 for pitch, 10 for roll, which differs from the paper. Can you give me some advice? Thank you! My WeChat is zz362379625.

wqz960 avatar Dec 16 '20 06:12 wqz960

@wqz960 In the AFLW2000 landmarks, some points have a minimum value less than 0 or a maximum value greater than the image width or height. You need to clip them to the image bounds.

CrossEntropy avatar Dec 17 '20 02:12 CrossEntropy

Hi, how do I train the network correctly? I use the 300W-LP dataset for training, but the loss barely changes and does not converge. The figure below shows the result on 300W-LP with a batch size of 64 after 3 epochs.

[figure: training curves after 3 epochs on 300W-LP]

kellen5l avatar Jan 29 '21 17:01 kellen5l

How do you get the head pose (yaw, pitch, roll) from the AFLW2000 and 300W-LP datasets?

I have downloaded the datasets and am trying to process the data and labels, but I don't understand where the Euler head-pose angles come from in these datasets.

For example, I checked the .mat file given for one sample. It has a pose_param field with a 1x7 array. How do I get the Euler angles from it? 0.085767917 -0.095569924 0.075868770 220.44441 167.35301 -100.53674 0.0012647437
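
My guess (an assumption on my part, not something I have confirmed) is that the field is Pose_Para and its first three entries are pitch, yaw and roll in radians, followed by translation and scale. Reading the angles would then look roughly like this sketch:

import numpy as np
import scipy.io as sio

def read_pose(mat_path):
    mat = sio.loadmat(mat_path)
    # assumption: Pose_Para = [pitch, yaw, roll, tdx, tdy, tdz, scale], angles in radians
    pitch, yaw, roll = mat['Pose_Para'][0][:3]
    # convert radians to degrees
    return np.degrees(yaw), np.degrees(pitch), np.degrees(roll)

Please correct me if that is not the right interpretation.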

Susiehub avatar Sep 21 '22 22:09 Susiehub


@kellen5l Could you tell me how exactly you wrote the training command, and what does the snapshot argument refer to? Thanks.

cunesewangst avatar Oct 15 '22 12:10 cunesewangst

@cunesewangst Sorry. I completely forgot.

kellen5l avatar Oct 16 '22 10:10 kellen5l

@cunesewangst Did your training converge in the end?

lovegit2021 avatar Sep 14 '23 16:09 lovegit2021