ATVGnet
Confused about "normLmarks" function
Many thanks for this repo. I am trying to reimplement your training process, but I am stuck in the data preprocessing. Specifically, I am confused about the "normLmarks" function.
- I wonder: when there is only one face per frame (len(lmarks.shape) == 2), will "normLmarks" always produce the same output? I have marked the related lines in your code with "#". It seems @ssinha89 also found this issue: https://github.com/lelechen63/ATVGnet/issues/17#issuecomment-547884824.
- Could you tell me more about the meaning of "init_params", "params", and "predicted"? What do "S" and "SK" mean here? I understand that you use "procrustes" to align each set of landmarks to the mean face, but I am confused about the process after that. Could you point me to any related papers describing it? (See the sketch after the code excerpt below.)
def normLmarks(lmarks):
    norm_list = []
    idx = -1
    max_openness = 0.2
    mouthParams = np.zeros((1, 100))
    mouthParams[:, 1] = -0.06
    tmp = deepcopy(MSK)
    tmp[:, 48*2:] += np.dot(mouthParams, SK)[0, :, 48*2:]
    open_mouth_params = np.reshape(np.dot(S, tmp[0, :] - MSK[0, :]), (1, 100))
    if len(lmarks.shape) == 2:
        lmarks = lmarks.reshape(1, 68, 2)
    for i in range(lmarks.shape[0]):
        mtx1, mtx2, disparity = procrustes(ms_img, lmarks[i, :, :])
        mtx1 = np.reshape(mtx1, [1, 136])
        mtx2 = np.reshape(mtx2, [1, 136])
        norm_list.append(mtx2[0, :])
    pred_seq = []
    init_params = np.reshape(np.dot(S, norm_list[idx] - mtx1[0, :]), (1, 100))
    for i in range(lmarks.shape[0]):
        params = np.reshape(np.dot(S, norm_list[i] - mtx1[0, :]), (1, 100)) - init_params - open_mouth_params
        ######## "params" will always be equal to (-open_mouth_params) ########
        predicted = np.dot(params, SK)[0, :, :] + MSK
        pred_seq.append(predicted[0, :])
    return np.array(pred_seq), np.array(norm_list), 1
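For readers following the thread, here is a minimal, self-contained sketch of what the call chain above appears to do: Procrustes-align a frame to a reference shape, project the aligned residual into a low-dimensional parameter space, then reconstruct landmarks from those parameters. It assumes S is a 100x136 projection matrix, SK a 1x100x136 reconstruction basis, and MSK a 1x136 flattened mean shape; these shapes are inferred only from the dot products in the excerpt, not documented in the repo, and scipy.spatial.procrustes is used as a stand-in for the repo's own procrustes helper.

# Hedged sketch of the apparent pipeline: align -> project -> reconstruct.
# All matrices below are random stand-ins; shapes are inferred from the
# excerpt (S: 100x136, SK: 1x100x136, MSK: 1x136), not confirmed by the repo.
import numpy as np
from scipy.spatial import procrustes  # stand-in for the repo's procrustes()

rng = np.random.default_rng(0)
S = rng.standard_normal((100, 136))      # projection onto 100 shape parameters
SK = rng.standard_normal((1, 100, 136))  # reconstruction basis
MSK = rng.standard_normal((1, 136))      # "mean" landmark shape, flattened
ms_img = rng.standard_normal((68, 2))    # reference (mean) face landmarks

frame = rng.standard_normal((68, 2))     # one frame of 68 (x, y) landmarks

# 1. Procrustes-align the frame to the reference shape.
mtx1, mtx2, _ = procrustes(ms_img, frame)
aligned = mtx2.reshape(1, 136)
ref = mtx1.reshape(1, 136)

# 2. Project the aligned residual into the 100-dim parameter space.
params = np.dot(S, (aligned - ref)[0]).reshape(1, 100)

# 3. Reconstruct landmarks from the parameters on top of the mean shape.
predicted = np.dot(params, SK)[0, :, :] + MSK   # shape (1, 136)
print(predicted.shape)

If this reading is right, "init_params" is the parameter vector of the reference frame (norm_list[idx]), "params" is each frame's parameters expressed relative to it (with a fixed open-mouth offset subtracted), and "predicted" is the landmark sequence reconstructed from those relative parameters.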
Please refer to https://github.com/eeskimez/Talking-Face-Landmarks-from-Speech for the audio-to-landmark part.
Thanks, I got the information.
@wjbKimberly @lelechen63 Hi there, I also encountered the same problem. I want to train ATNet with my own dataset; the landmark data is preprocessed using code extracted from demo.py, and the preprocessed landmark data comes out all the same. What confuses me is whether this is normal. If it is not correct, did you solve this problem? Could you give some suggestions about where it went wrong? Thank you!
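For what it's worth, identical outputs are consistent with the degenerate case flagged with "#" in the code excerpt above: when a single frame is passed, norm_list has one entry, idx = -1 selects that same entry, so init_params cancels the frame's own parameters and params collapses to -open_mouth_params regardless of the input. A minimal numeric check with random stand-in matrices (same assumed shapes as in the sketch above, not the repo's actual data):

# Numeric check of the single-frame case: params - init_params is identically
# zero, so the output no longer depends on the input landmarks.
# S, mtx1, and the aligned frame are random stand-ins, not the repo's matrices.
import numpy as np

rng = np.random.default_rng(1)
S = rng.standard_normal((100, 136))
mtx1 = rng.standard_normal((1, 136))      # aligned reference shape
norm_list = [rng.standard_normal(136)]    # a single aligned frame
idx = -1                                  # with one frame, norm_list[idx] is norm_list[0]

init_params = np.dot(S, norm_list[idx] - mtx1[0, :]).reshape(1, 100)
params = np.dot(S, norm_list[0] - mtx1[0, :]).reshape(1, 100) - init_params
print(np.allclose(params, 0))             # True: only -open_mouth_params remains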
@hot-dog I find that 'example_landmark' never changes in demo.py even when the input template image changes. Is it like a mean value that does not need to change? What should I do when training? It also does not match the picture in the paper, which leaves me confused.