character-motion-vaes

How to generate mocap.npz?

Open Minotaur-CN opened this issue 3 years ago • 15 comments

Hi, how do I generate mocap.npz? It doesn't seem easy to me. Can you give a clue on how to generate mocap.npz from a public mocap dataset?

The train_mvae.py script assumes the mocap data to be at environments/mocap.npz. The original training data is not included in this repo, but it can be easily extracted from other public datasets.

Thanks very much!

BEST

Minotaur-CN avatar Mar 14 '21 08:03 Minotaur-CN

+1, could you provide some details and format description about the mocap data? Thanks!

@fabiozinno @belinghy

wangshub avatar Apr 07 '21 09:04 wangshub

+1, could you provide some details and format description about the mocap data? Thanks!

OOF-dura avatar Jun 20 '21 06:06 OOF-dura

@Minotaur-CN @OOF-dura

The raw data format is not complex; just read the code below:

  • https://github.com/electronicarts/character-motion-vaes/blob/main/vae_motion/train_mvae.py#L140
  • https://github.com/electronicarts/character-motion-vaes/blob/main/vae_motion/train_mvae.py#L141

wangshub avatar Jun 21 '21 11:06 wangshub

Below is some information about the data format. We also note the length of each mocap sequence (as mentioned above at L141). This is so we don't sample invalid transitions for training. If the mocap clip is one long continuous sequence, then there is no reason to do this.

    0-3 : root delta x, delta y, delta facing
   3-69 : joint coordinates (22 * 3 = 66)
 69-135 : joint velocities in Cartesian coordinate in previous root frame (22 * 3 = 66)
135-267 : 6D joint orientations, i.e. first two columns of rotation matrix (22 * 6 = 132)
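For reference, the layout above can be sliced with plain NumPy. A minimal sketch, using a dummy frame; the variable names are my own, not from the repo:

```python
import numpy as np

# Slice one 267-dim pose vector into its components, matching the layout above.
frame = np.arange(267, dtype=np.float32)  # dummy pose vector for one frame

root_deltas  = frame[0:3]                      # delta x, delta y, delta facing
joint_pos    = frame[3:69].reshape(22, 3)      # 22 joints * 3 coordinates
joint_vel    = frame[69:135].reshape(22, 3)    # 22 joints * 3 velocities
joint_orient = frame[135:267].reshape(22, 6)   # 22 joints * 6D orientation

# Sanity check: the pieces add up to the full 267-dim vector.
assert 3 + 66 + 66 + 132 == 267
```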

For extracting training data from mocap datasets, I think fairmotion might be helpful. Based on the examples I have seen, though I haven't tested it, it should be something like the code below. Root deltas need some more processing; essentially, find the displacement vector and rotate it by the current facing direction of the character. The same goes for positions and velocities: they should be projected into character space to make learning easier.

from fairmotion.data import bvh

motion = bvh.load(BVH_FILENAME)

positions = motion.positions(local=False)  # (frames, joints, 3), world-space joint positions
velocities = positions[1:] - positions[:-1]  # finite-difference velocities, one frame shorter
orientations = motion.rotations(local=False)[..., :, :2].reshape(-1, 22, 6)  # first two columns of each rotation matrix
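The root-delta processing described above (rotate the displacement by the current facing) could be sketched like this. This is my own sketch, not code from the repo, and the facing-angle convention (angle of the forward vector from the world x-axis, in radians) is an assumption:

```python
import numpy as np

def root_deltas(root_xy, facing):
    """Per-frame root displacement expressed in the character's own frame.

    root_xy : (frames, 2) ground-plane root positions
    facing  : (frames,) facing angles in radians (assumed convention: angle
              of the forward vector measured from the world x-axis)
    """
    disp = root_xy[1:] - root_xy[:-1]            # world-frame displacement
    cos, sin = np.cos(facing[:-1]), np.sin(facing[:-1])
    # Rotate each displacement by -facing to express it in the character frame.
    dx = cos * disp[:, 0] + sin * disp[:, 1]
    dy = -sin * disp[:, 0] + cos * disp[:, 1]
    dfacing = facing[1:] - facing[:-1]           # change in facing angle
    return np.stack([dx, dy, dfacing], axis=-1)  # (frames - 1, 3)
```

For example, a character facing world +y and stepping one unit along +y should see that step as pure forward motion in its own frame.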

belinghy avatar Jun 21 '21 18:06 belinghy

@belinghy can you tell me more about how to get root deltas?

I think a sample, formula or code would be better

Realdr4g0n avatar Jun 29 '21 11:06 Realdr4g0n

I may have misunderstood the whole process, but since there isn't any sample of mocap.npz, I assume that mocap.npz should look like this:

  • "data": a list of 267 float numbers per frame (the first entries, the root deltas, are described in the paper's "pose representation")
  • "end_indices": 267? (the length of each mocap sequence)

It seems the mocap data has to include exactly 22 joints, so extracting from other public datasets may not work, as BVH files or other mocap data out there may have a different number of joints. Even after discarding irrelevant joints from the data, the joint index order is another issue, as you can see in mocap_env.py :(

Therefore, I think there are two ways to solve this:

  • modify the joint-relevant parts of this project (mocap_env.py, etc.)
  • or discard extra joints and keep the joint index order, referring to pose0.npy (not sure about this)
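The second option, discarding extra joints while reordering to a 22-joint target, could be sketched as below. The joint map here is entirely made up; the real order would have to be read off pose0.npy or mocap_env.py:

```python
import numpy as np

# Hypothetical index map: position i holds the source-skeleton index of the
# i-th joint in the 22-joint order the repo expects (these values are made up).
JOINT_MAP = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
             12, 13, 14, 16, 17, 18, 20, 21, 22, 24, 25]

def select_joints(positions, joint_map=JOINT_MAP):
    """Keep only the mapped joints, in target order.

    positions : (frames, source_joints, 3) array from any mocap source
    returns   : (frames, 22, 3)
    """
    return positions[:, joint_map, :]
```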

I wasn't able to find which mocap database this project used, and it wasn't in the paper. :(

ameliacode avatar Apr 20 '22 17:04 ameliacode

Your understanding of the format is correct, except that end_indices marks the end of each mocap clip. It depends on the number of mocap clips you have, so it's not necessarily 267. For example, if there are two clips of lengths 10 and 15, then end_indices = np.cumsum([10, 15]) - 1 = [9, 24].
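That example, plus the reason end_indices exists (avoiding transitions that cross clip boundaries), could be sketched like this; the variable names are my own:

```python
import numpy as np

# end_indices marks the last frame of each clip in the concatenated data.
clip_lengths = [10, 15]                    # two clips, 10 and 15 frames
end_indices = np.cumsum(clip_lengths) - 1  # -> [9, 24]

# A frame-to-frame transition (i, i + 1) is only valid for training if it
# does not cross a clip boundary, i.e. frame i is not an end index.
valid_starts = np.setdiff1d(np.arange(sum(clip_lengths) - 1), end_indices)
```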

As you've noted, mocap_env.py could definitely be refactored. I think the only things to change if you are using a different input format are these lines and these lines. The second reference applies only if you keep the 0-3 : root delta x, delta y, delta facing layout. Am I missing anything else?

belinghy avatar Apr 20 '22 20:04 belinghy

So, as mentioned above, if I get this right, end_indices might contain a single integer value if the input clip is one long continuous sequence. However, I still don't get what "length" is in this case. Is it a frame number, or is some other unit used?

ameliacode avatar Apr 21 '22 12:04 ameliacode

Yes, it's a frame number. end_indices contains a single integer value if there is exactly one input clip, i.e., one long continuous sequence.

belinghy avatar Apr 21 '22 18:04 belinghy

Hi, I have some confusion about 135-267 : 6D joint orientations, i.e. first two columns of rotation matrix (22 * 6 = 132) and orientations = motion.rotations(local=False)[..., :, :2].reshape(-1, 22, 6). In your case, is the z-axis the world up vector, and are the 6D joint orientations the orientations of the other two directions?

Furthermore, can you provide some examples for 0-3 : root delta x, delta y, delta facing? I am a bit confused about the definition of these variables.

Thank you

edentliang avatar Apr 27 '22 14:04 edentliang

Maybe this will help: https://arxiv.org/pdf/2103.14274.pdf : see pose representation for root information.

I think the paper and the code differ slightly in which up-vector they use. Overall, the root delta has to include two values for the root position projected on the ground plus the root facing direction, and the joint orientations have to encode a form of rotation matrix built from the relative forward and upward vectors.

ameliacode avatar Apr 28 '22 05:04 ameliacode

Hello,

@belinghy , when reshaping the rotation components as returned by fairmotion, orientations = motion.rotations(local=False)[..., :, :2].reshape(-1, 22, 6), how are the vector components distributed? Considering the first joint in the first frame, where the 6 values on the third dimension contain the 3 components of each of the first 2 columns of the rotation matrix, would they be laid out as in the first version below or as in the second one?

version 1:

orientations[0, 0, 0] = comp1_x
orientations[0, 0, 1] = comp1_y
orientations[0, 0, 2] = comp1_z
orientations[0, 0, 3] = comp2_x
orientations[0, 0, 4] = comp2_y
orientations[0, 0, 5] = comp2_z

version 2:

orientations[0, 0, 0] = comp1_x
orientations[0, 0, 1] = comp2_x
orientations[0, 0, 2] = comp1_y
orientations[0, 0, 3] = comp2_y
orientations[0, 0, 4] = comp1_z
orientations[0, 0, 5] = comp2_z

Gabriel-Bercaru avatar Jul 28 '22 14:07 Gabriel-Bercaru

Hi @Gabriel-Bercaru, I'm not sure what fairmotion's convention is. Are you rendering the character using joint orientations? If not, for the purpose of neural network input, the order shouldn't matter.

belinghy avatar Jul 28 '22 18:07 belinghy

Hello, indeed for the input training data it doesn't really matter, but I was trying to render a mesh over a trained model.

As far as I have seen, rigging makes use of the joint orientations, and in order to get them I should convert those 6D orientation vectors to either Euler rotations or quaternions.

Gabriel-Bercaru avatar Jul 28 '22 19:07 Gabriel-Bercaru

The way it's indexed, e.g., [..., :, :2], should correspond to version 2.
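This can be checked with a small NumPy experiment on a single labeled matrix (one frame, one joint, so the reshape target is (-1, 1, 6) instead of (-1, 22, 6)); the entries are labeled so each one identifies its column (comp) and row (axis):

```python
import numpy as np

# One 3x3 rotation-matrix stand-in: entry at row j, column k is comp(k+1)
# along axis j, encoded as tens digit = axis+1, units digit = comp.
R = np.array([[11., 21., 31.],
              [12., 22., 32.],
              [13., 23., 33.]])
rot = R[None, None]                        # shape (1 frame, 1 joint, 3, 3)

six_d = rot[..., :, :2].reshape(-1, 1, 6)  # first two columns -> 6 numbers
# Row-major flattening of the 3x2 block interleaves the two columns:
# [comp1_x, comp2_x, comp1_y, comp2_y, comp1_z, comp2_z], i.e. version 2.
print(six_d[0, 0])  # -> [11. 21. 12. 22. 13. 23.]
```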

belinghy avatar Jul 28 '22 19:07 belinghy