X-Avatar
Creating a new avatar
Hi, awesome work! I have a question: how can I create a new avatar from an image or video, and apply the .pkl motion file to it?
Thanks for your interest. Our work can only create avatars from either 3D posed scans or RGB-D videos. Currently, it does not work with a single image or an RGB video.
The format for 3D avatars includes a .pth file and an .npz file. Assuming I have a 3D avatar in .obj format, how can I prepare it to be used by X-Avatar? Should I have a sequence of 3D poses or a single static 3D pose?
Hi, I am a little bit confused. Do you want to test our pretrained 3D avatars, or to train your own personal avatars?
If you want to test our avatars, you can use our pretrained model in the form of .pth and .npz files and run it on the .pkl motion.
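Very roughly, those artifacts look like this when loaded; the file names and contents below are placeholders rather than the exact interface of our test script, which handles this for you:

```python
import pickle

import numpy as np
import torch

# Placeholder file names; in practice the provided test script does the loading.
checkpoint = torch.load("pretrained_model.pth", map_location="cpu")  # network weights
meta_info = np.load("meta_info.npz")                                 # auxiliary avatar data

# A .pkl motion file stores per-frame SMPL-X pose parameters.
with open("motion.pkl", "rb") as f:
    motion = pickle.load(f)

print(type(checkpoint), meta_info.files)
```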
If you want to train your own models and you already have human meshes in .obj format, you can convert them from .obj to .ply (see the sketch below). To train our X-Avatars, you should have a sequence of 3D poses, i.e., SMPL-X parameters corresponding to the human meshes. We strongly suggest you download our X-Humans dataset to better understand the structure of the data used for training X-Avatars.
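For the .obj to .ply conversion, a third-party library such as trimesh (not part of X-Avatar) is one easy option; a minimal sketch:

```python
import trimesh

# Load the scan without post-processing so the vertex order is preserved,
# then export it in .ply format.
mesh = trimesh.load("scan.obj", process=False)
mesh.export("scan.ply")
```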
Hi, thanks for your reply, but I am still confused.
The file structure in the X-Humans dataset:
└── Sequence ID/
    ├── meshes_pkl/
    │   ├── atlas-fxxxxx.pkl: low-res textures as pickle files (1024, 1024, 3)
    │   └── mesh-fxxxxx.pkl: 'vertices', 'normals', 'uvs', 'faces'
    ├── render/
    │   ├── depth/
    │   │   └── depth_xxxxxx.tiff: depth image
    │   ├── image/
    │   │   └── color_xxxxxx.png: RGB image
    │   └── cameras.npz: intrinsics and extrinsics of the virtual camera
    ├── SMPL/
    │   ├── mesh-fxxxxx_smpl.pkl: SMPL params 'global_orient' (3,), 'transl' (3,), 'body_pose' (69,), 'betas' (10,) (use mean_shape_smpl.npy instead)
    │   └── mesh-fxxxxx_smpl.ply: SMPL meshes
    └── SMPLX/
        ├── mesh-fxxxxx_smplx.pkl: SMPL-X params 'global_orient' (3,), 'transl' (3,), 'body_pose' (63,), 'left_hand_pose' (45,), 'right_hand_pose' (45,), 'jaw_pose' (3,), 'leye_pose' (3,), 'reye_pose' (3,), 'expression' (10,), 'betas' (10,) (use mean_shape_smplx.npy instead)
        └── mesh-fxxxxx_smplx.ply: SMPL-X meshes
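For reference, one frame of this layout can be inspected like so (the frame index f00001 is a placeholder, and this assumes the .pkl stores a dict of named arrays as listed above):

```python
import pickle

import numpy as np

# Inspect the SMPL-X parameters of one frame (placeholder frame index).
with open("SMPLX/mesh-f00001_smplx.pkl", "rb") as f:
    smplx_params = pickle.load(f)
for name, value in smplx_params.items():
    print(name, np.asarray(value).shape)  # e.g. body_pose (63,), betas (10,)

# The virtual camera's intrinsics and extrinsics live in render/cameras.npz.
cameras = np.load("render/cameras.npz")
print(cameras.files)
```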
- If I want to create and train a new avatar following the same structure, what are the steps?
- How do I generate atlas-fxxxxx.pkl and mesh-fxxxxx.pkl?
- How do I generate the render and SMPLX folders?
Thanks in advance.
Hi, I think we're not on the same page. Our method cannot create an X-Avatar from a single .obj file, which usually cannot be called an "avatar" anyway, since it's static. I think what you want is a rigging technique that embeds a skeleton into your .obj file.
Regarding the dataset structure, these files are components of our textured scans (atlas, mesh, etc.). They are not easy to produce on your side, since we rely on a dense camera rig and commercial software to obtain these static textured scans.
What about using a monocular depth estimation technique to create RGB-D data from RGB video? I'm not sure about the performance of current monocular depth estimators, and another problem is that the camera parameters of the depth images need to be aligned with the SMPL-X model. The RGB-D data quality may be seriously degraded compared with X-Humans, so I'm not sure if it works. Do you think it is a feasible method? Thanks. @MoyGcc @Skype-line
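For concreteness, a relative depth map can be produced with an off-the-shelf estimator such as MiDaS; this is just an illustration, unrelated to the X-Avatar code itself:

```python
import cv2
import torch

# Sketch: relative (inverse) depth from a single RGB frame with MiDaS.
# Note: the output has no absolute scale, which is exactly the concern below.
midas = torch.hub.load("intel-isl/MiDaS", "DPT_Large")
midas.eval()
transform = torch.hub.load("intel-isl/MiDaS", "transforms").dpt_transform

img = cv2.cvtColor(cv2.imread("frame.png"), cv2.COLOR_BGR2RGB)
with torch.no_grad():
    pred = midas(transform(img))
    pred = torch.nn.functional.interpolate(
        pred.unsqueeze(1), size=img.shape[:2],
        mode="bicubic", align_corners=False,
    ).squeeze()
rel_depth = pred.cpu().numpy()  # relative inverse depth, arbitrary scale
```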
I actually doubt it. Current monocular depth estimation usually doesn't give an absolute scale, only relative depth values. You can try to obtain the absolute scale by aligning the fitted SMPL model with the depth maps, since SMPL gives you the rough height and thickness of a human. We had an avatar paper at last year's CVPR which reconstructs an avatar (SMPL-driven only) from a real Kinect RGB-D sequence, but the quality is clearly not as good as X-Avatar's. Monocularly estimated depth might be even worse than Kinect output.
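The scale alignment mentioned above could look roughly like this; a sketch under the assumption that you can render a depth map from the fitted SMPL mesh, with all names hypothetical:

```python
import numpy as np

def align_depth_scale(rel_depth, smpl_depth, mask):
    """Least-squares fit of scale s and shift t so that
    s * rel_depth + t best matches the SMPL-rendered depth inside mask."""
    x = rel_depth[mask].astype(np.float64)
    y = smpl_depth[mask].astype(np.float64)
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    return s * rel_depth + t  # depth in (approximately) metric units
```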
How about using multiple depth cameras, as a middle ground between a full depth rig and a monocular setup, so that they can cover the whole human? For example, the FOV of a ZED is 120 degrees, so placing 3 or 4 ZED cameras around the person to capture their pose in sequence may also be viable?
Hi, this is great work! I am wondering if I can use an iPhone scanner to obtain RGB-D videos. And what should I do to obtain the data in the SMPLX and mesh folders?