Co-Speech_Gesture_Generation

dim of output

Open zhewei-mt opened this issue 1 year ago • 11 comments

Hello, I notice that the model output has 216 dimensions, i.e. 18x12, which can be reduced to 18x6 by converting the rotation matrices to Euler angles, where 18 is the number of joints. As far as I know, the Euler angles for each joint form a 1x3 vector, so I am confused about the meaning of your output. Could you please provide more information about it, e.g. the origin of each joint, and whether each rotation is relative to the parent joint or absolute? Thanks in advance!

zhewei-mt • May 08 '23 10:05

Hello, I suspect you're using 24 joints (incl both upper and lower body parts). A rotation matrix has 9 values so that 24 x 9 is 216.

youngwoo-yoon • May 09 '23 06:05

I double-checked the code in "inference.py". Below is line 152:

    out_poses = out_poses.reshape((out_poses.shape[0], -1, 12))  # (n_frames, n_joints, 12)

It reshapes the output to (n_frames, n_joints, 12), which makes n_joints equal to 18. I don't understand why the shape is 18x12.

zhewei-mt • May 11 '23 02:05

Thanks for pointing that out. I'll take a look.

youngwoo-yoon • May 11 '23 04:05

Sorry for the confusion. 3 for x, y, z positions and 9 for a rotation matrix, so 12 in total. Please refer to this: https://github.com/youngwoo-yoon/Co-Speech_Gesture_Generation/blob/46671fc6af8a3dbcb93cdafca4f07bc064a2c3b1/scripts/twh_dataset_to_lmdb.py#L46
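For readers following along, the layout described above can be sketched in NumPy. This is a minimal illustration, not code from the repository: the function name `split_pose_vector` is hypothetical, and it assumes the 3 position values come first and the 9 rotation values are a row-major 3x3 matrix. Check the linked line in twh_dataset_to_lmdb.py to confirm the actual order.

```python
import numpy as np

N_JOINTS = 18          # joint count reported in this thread
VALUES_PER_JOINT = 12  # 3 position values + 9 rotation-matrix values


def split_pose_vector(out_poses):
    """Split a (n_frames, 216) output into positions and rotation matrices."""
    out_poses = np.asarray(out_poses)
    # Same reshape as in inference.py: (n_frames, n_joints, 12)
    per_joint = out_poses.reshape((out_poses.shape[0], -1, VALUES_PER_JOINT))
    # ASSUMPTION: position first, then a row-major 3x3 rotation matrix.
    positions = per_joint[:, :, :3]
    rotations = per_joint[:, :, 3:].reshape(
        per_joint.shape[0], per_joint.shape[1], 3, 3
    )
    return positions, rotations


# Dummy data standing in for real model output.
dummy = np.zeros((5, N_JOINTS * VALUES_PER_JOINT))
positions, rotations = split_pose_vector(dummy)
print(positions.shape, rotations.shape)
```

With 18 joints this gives positions of shape (n_frames, 18, 3) and rotations of shape (n_frames, 18, 3, 3).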

youngwoo-yoon • May 14 '23 13:05

Thanks for clarifying. Do all joints share the same origin? Is there any way to convert the rotation information to be compatible with SMPL?

zhewei-mt • May 15 '23 07:05

> Do all joints share the same origin?

The answer is yes if you're asking about the positions, because all the joints are in the same coordinate system. The rotations are local: each one is the joint's rotation relative to its parent joint.
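As a rough illustration of what "local" means here, parent-relative rotations can be accumulated down the hierarchy to get rotations in the shared coordinate system. The sketch below is not from the repository: the function names and the 3-joint chain are hypothetical, and it assumes the column-vector convention in which a child's global rotation is its parent's global rotation multiplied by the child's local rotation.

```python
import numpy as np


def local_to_global_rotations(local_rots, parents):
    """Accumulate parent-relative rotations down a joint hierarchy.

    local_rots: (n_joints, 3, 3), rotation of each joint relative to its parent.
    parents:    parent index per joint, -1 for the root. Parents must appear
                before their children.
    """
    local_rots = np.asarray(local_rots, dtype=float)
    global_rots = np.empty_like(local_rots)
    for j, p in enumerate(parents):
        if p < 0:
            # The root has no parent, so its local rotation is already global.
            global_rots[j] = local_rots[j]
        else:
            # ASSUMPTION: R_global(child) = R_global(parent) @ R_local(child)
            global_rots[j] = global_rots[p] @ local_rots[j]
    return global_rots


def rot_z(deg):
    """Rotation about the z axis by an angle in degrees."""
    a = np.deg2rad(deg)
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a), np.cos(a), 0.0],
                     [0.0, 0.0, 1.0]])


# Hypothetical chain root -> joint 1 -> joint 2, not the real 18-joint skeleton.
parents = [-1, 0, 1]
local = np.stack([rot_z(30), rot_z(30), rot_z(30)])
global_rots = local_to_global_rotations(local, parents)
```

In this toy chain each joint adds 30 degrees about z, so the last joint ends up rotated 90 degrees in the shared frame.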

> Is there any way to convert the rotation information to be compatible with SMPL?

No, as far as I know. It is not a simple procedure because the dataset only has poses, not shape.

youngwoo-yoon • May 15 '23 07:05

Thanks for your reply. My bad, I am not asking about the shape parameters of SMPL. My goal is to drive an avatar in UE5. I expect that I will need some preprocessing so that UE is able to "recognize" your output. Do you have any expertise in this field?

zhewei-mt • May 15 '23 08:05

I usually use Blender for the animation after putting the output values into a BVH file. This might help you: https://github.com/TeoNikolov/genea_visualizer Unfortunately, I do not have experience with UE5.
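BVH files store joint rotations as Euler angles in degrees, so the rotation matrices have to be converted before writing. The sketch below shows one way to do that. It assumes a Z, X, Y channel order (R = Rz @ Rx @ Ry), which is only an assumption: the real order is whatever the CHANNELS lines of the target BVH header declare. The function names are hypothetical, and gimbal lock (x rotation near ±90 degrees) is not handled.

```python
import numpy as np


def euler_zxy_to_matrix(z_deg, x_deg, y_deg):
    """Build R = Rz @ Rx @ Ry from angles in degrees."""
    z, x, y = np.deg2rad([z_deg, x_deg, y_deg])
    rz = np.array([[np.cos(z), -np.sin(z), 0.0],
                   [np.sin(z), np.cos(z), 0.0],
                   [0.0, 0.0, 1.0]])
    rx = np.array([[1.0, 0.0, 0.0],
                   [0.0, np.cos(x), -np.sin(x)],
                   [0.0, np.sin(x), np.cos(x)]])
    ry = np.array([[np.cos(y), 0.0, np.sin(y)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(y), 0.0, np.cos(y)]])
    return rz @ rx @ ry


def matrix_to_euler_zxy(rot):
    """Recover (z, x, y) in degrees from R = Rz @ Rx @ Ry.

    Valid away from gimbal lock, i.e. when |x| < 90 degrees.
    """
    rot = np.asarray(rot, dtype=float)
    x = np.arcsin(np.clip(rot[2, 1], -1.0, 1.0))  # R[2,1] = sin(x)
    z = np.arctan2(-rot[0, 1], rot[1, 1])         # -R[0,1] / R[1,1] = tan(z)
    y = np.arctan2(-rot[2, 0], rot[2, 2])         # -R[2,0] / R[2,2] = tan(y)
    return np.rad2deg([z, x, y])


# Round trip on arbitrary angles to check the two functions agree.
angles = matrix_to_euler_zxy(euler_zxy_to_matrix(20.0, -35.0, 50.0))
```

If SciPy is available, `scipy.spatial.transform.Rotation` offers the same conversion with explicit axis-order arguments and may be the safer choice.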

youngwoo-yoon • May 15 '23 08:05

Thanks! Two more questions.

  1. For the 18-joint setting, is there a picture that visualizes all the joint positions? I don't understand the difference between 'b_l_wrist_twist' and 'b_l_wrist', and a visualization would help a lot, if one exists.
  2. What is the initial body pose, e.g. A-pose or T-pose?

zhewei-mt • May 16 '23 07:05

  1. When you import a BVH file in Blender, you can visually check joints and their names.
  2. T-pose. Some technical details for the retargeting to T-pose are in https://arxiv.org/pdf/2303.08737.pdf

youngwoo-yoon • May 17 '23 01:05

For the coordinate system of each joint, if I understand correctly, when the character faces out of the screen, the axes are oriented like this: -y up and z forward. (image attachment) Please correct me if I am wrong. Thanks!

zhewei-mt • Jun 27 '23 02:06