NeuralAnnot_RELEASE
Function projecting 3D joint coordinates Jˆ3D to the image plane with predicted camera parameters.
Hi mks0601! I am reproducing your paper, but I am stuck at projecting the 3D joint coordinates Jˆ3D to the image plane with the predicted camera parameters to obtain the 2D joints. Could you make this function public, along with the λwild,θ parameter for training the SMPL model? Many thanks.
Please refer to https://github.com/mks0601/Hand4Whole_RELEASE/blob/7c3b7fd6ac3e0824210796a2267fe7f8fdc5e968/main/model.py#L77
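The linked code projects camera-space joints to the image plane using a fixed virtual focal length and principal point. A minimal sketch of that kind of projection, assuming config-style `focal` and `princpt` constants (the exact values and variable names in Hand4Whole may differ):

```python
import torch

# assumed virtual camera intrinsics (Hand4Whole defines similar constants in its config)
focal = (1500.0, 1500.0)   # virtual focal length (fx, fy)
princpt = (128.0, 128.0)   # principal point for a 256x256 image

def project_joints(joint_cam, cam_trans):
    """Project camera-space 3D joints (B, J, 3) to the image plane.

    cam_trans: predicted camera translation (B, 3) added to the root-relative joints.
    Returns (B, J, 2) pixel coordinates.
    """
    joint_cam = joint_cam + cam_trans[:, None, :]
    x = joint_cam[:, :, 0] / (joint_cam[:, :, 2] + 1e-4) * focal[0] + princpt[0]
    y = joint_cam[:, :, 1] / (joint_cam[:, :, 2] + 1e-4) * focal[1] + princpt[1]
    return torch.stack((x, y), dim=2)
```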
How about λwild,θ for calculating the SMPL parameter loss? Could you clarify it? Many thanks for your kind response.
I set it to 1 when calculating the loss on 256x256 image space.
Thank you, you are very kind, and I have one more question. This is my regularizer loss function:

```python
class RegularizeLoss(nn.Module):
    def __init__(self):
        super(RegularizeLoss, self).__init__()

    def forward(self, pose):
        loss = torch.pow(pose, 2)
        loss = torch.sum(loss, dim=1)
        return loss
```

Is `pose` only the 3D body rotations, or does it include the 3D global rotation?
I did mean instead of sum here: `loss = torch.sum(loss, dim=1)`. Also, the 3D global rotation should be excluded.
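A minimal corrected sketch following this answer (mean over the pose dimensions, with the global rotation excluded before the call):

```python
import torch
import torch.nn as nn

class RegularizeLoss(nn.Module):
    def __init__(self):
        super(RegularizeLoss, self).__init__()

    def forward(self, pose):
        # pose: (batch, (J-1)*3) axis-angle body pose, global rotation already excluded
        loss = torch.pow(pose, 2)
        loss = torch.mean(loss, dim=1)  # mean instead of sum, per the answer above
        return loss
```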
I trained with my own dataset in MSCOCO format (no face and hand boxes) for 7 epochs.
The 2D loss and the regularizer loss converge to 1.2 and 0.45, respectively.
I used RotationNet from this link to predict the 3D pseudo-GTs: https://github.com/mks0601/Hand4Whole_RELEASE/blob/7c3b7fd6ac3e0824210796a2267fe7f8fdc5e968/common/nets/module.py#L24
And this is the result.
The 2D joints in the images, obtained from the 3D joint coordinates and the predicted camera parameters, look good, but the rendered 3D result is bad.
This is the 2D result.
This is the 3D result.
Please tell me what mistake I made.
Thank you so much
I don't know your exact training settings, but I guess you should increase the L2 regularizer weight.
I follow your Hand4Whole training setup: https://github.com/mks0601/Hand4Whole_RELEASE (https://github.com/mks0601/Hand4Whole_RELEASE/blob/7c3b7fd6ac3e0824210796a2267fe7f8fdc5e968/main/config.py#L50).

This is the forward code to get the 3D pseudo-GT predictions:

```python
if cfg.parts == 'body':
    img_feat = self.forward_backbone(inputs, self.backbone)
    joint_img = smpl.reduce_joint_set(targets['joint_img'])
    smpl_root_pose, smpl_body_pose, smpl_shape, cam_trans = self.forward_rotation_net(img_feat, joint_img, self.rotation_net)
    joint_proj, joint_cam, mesh_cam = self.get_coord({'root_pose': smpl_root_pose, 'body_pose': smpl_body_pose, 'shape': smpl_shape, 'cam_trans': cam_trans}, mode)
    smpl_body_pose = smpl_body_pose.view(-1, (smpl.orig_joint_num - 1) * 3)
    smpl_pose = torch.cat((smpl_root_pose, smpl_body_pose), 1)
```

And this is the loss calculation:

```python
loss['joint_proj'] = self.coord_loss(joint_proj, targets['joint_img'][:,:,:2], meta_info['joint_trunc'])
loss['regularizer'] = self.regular_loss(smpl_body_pose)
```
This is the backward pass:

```python
loss = {k: loss[k].mean() for k in loss}
sum(loss[k] for k in loss).backward()
```
Anyway, I will increase the L2 regularizer weight (set it to 2) and test again.
After setting the L2 regularizer weight to 1 and calculating the 2D loss with the (x, y) coordinates resized to 8x8 space, I got these results.
A little bit better, but still not good enough.
Do you have any idea about this issue?
Did you apply the L2 reg on the SMPL shape parameter? I applied the L2 reg with weight 1 to the SMPL shape parameter (beta).
Also, I calculated 2D joint loss in 256x256 space.
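A minimal sketch consolidating the answers so far (2D joint loss in 256x256 space with weight 1, plus L2 regularizers on both the body pose θ and the shape β); `coord_loss` and `regularize_loss` are placeholders for the loss modules shown earlier in this thread, not names from the repo:

```python
# assumed: joint_proj and targets['joint_img'][:, :, :2] are both in 256x256 image space
loss = {}
loss['joint_proj'] = coord_loss(joint_proj, targets['joint_img'][:, :, :2], meta_info['joint_trunc'])
loss['regularizer_pose'] = regularize_loss(smpl_body_pose)   # L2 on theta, global rotation excluded
loss['regularizer_shape'] = regularize_loss(smpl_shape)      # L2 on beta, weight 1
sum(loss[k].mean() for k in loss).backward()
```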
I did not apply the L2 reg on the SMPL shape parameter; I only calculated the loss on the 3D body rotations (θ). I will try adding the SMPL shape parameter to the loss. Thank you so much.
Could you share the loss function and forward pass of RotationNet with me? I still cannot reach good results. Many thanks.
You can see this repo: https://github.com/mks0601/Hand4Whole_RELEASE
I already follow this repo, but there are 2 things I am not clear about: https://github.com/mks0601/Hand4Whole_RELEASE/blob/2f7a608cb05cf586d1e80c76e507d0802a6c13f0/main/config.py#L50
- You calculated the 2D joint loss in 256x256 space, so `self.output_hm_shape = (8, 8, 6)` needs to change to (256, 256, 256), right?
- Because I use 2D images, the input does not have a Z axis, so what value do I need to set for the Z axis of the input before feeding it to the RotationNet forward function? https://github.com/mks0601/Hand4Whole_RELEASE/blob/2f7a608cb05cf586d1e80c76e507d0802a6c13f0/common/nets/module.py#L58
- For the network of NeuralAnnot, I calculated the 2D loss in 256x256 space. For the network of Hand4Whole, I calculated the 2D loss in 8x8 space. The networks of the two works are different.
- joint_coord_img is estimated from PositionNet.
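A minimal sketch of what that answer implies, assuming a Hand4Whole-style pipeline where PositionNet predicts per-joint (x, y, z) coordinates in heatmap space, so no ground-truth Z value is needed as RotationNet input; the signatures below are assumptions, not the repo's verbatim API:

```python
# PositionNet estimates joint_coord_img (B, J, 3) from the backbone feature map,
# including a depth (z) coordinate, so RotationNet does not require GT depth.
joint_coord_img = self.position_net(img_feat)                                    # hypothetical signature
root_pose, body_pose, shape, cam_param = self.rotation_net(img_feat, joint_coord_img)
```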
I strictly follow your guidance but cannot get good results. Below are the losses I calculate for the backward pass:

```python
# 2D loss of PositionNet (in 8x8 space)
loss['joint_img'] = self.coord_loss(joint_img, smpl.reduce_joint_set(targets['joint_img'])/32., smpl.reduce_joint_set(meta_info['joint_trunc']), meta_info['is_3D'])
# 2D loss of RotationNet (in 256x256 space)
loss['joint_proj'] = self.coord_loss(joint_proj, targets['joint_img'][:,:,:2], meta_info['joint_trunc'])
loss['regularizer_body'] = self.regular_loss(smpl_body_pose)
loss['regularizer_shape'] = self.regular_loss(smpl_shape)
```
These are the loss classes:

```python
class RegularizeLoss(nn.Module):
    def __init__(self):
        super(RegularizeLoss, self).__init__()

    def forward(self, smpl_para):
        loss = torch.pow(smpl_para, 2)
        return loss
```
```python
class CoordLoss(nn.Module):
    def __init__(self):
        super(CoordLoss, self).__init__()

    def forward(self, coord_out, coord_gt, valid, is_3D=None):
        loss = torch.abs(coord_out - coord_gt) * valid
        if is_3D is not None:
            loss_z = loss[:,:,2:] * is_3D[:,None,None].float()
            loss = torch.cat((loss[:,:,:2], loss_z), 2)
        return loss
```
Please tell me what issue I have.
These are the results.
Are you using VPoser? As written in the paper, VPoser is helpful.
I did not use VPoser. Maybe that is the big mistake I made. Thank you for your advice.
I used VPoser to decode the SMPL pose from the 32-dimensional code but still did not get good results. Below is the code to get root_pose, body_pose, shape_param, cam_param:
```python
# __init__: predict the 32-dimensional VPoser code
self.vposer_out = make_linear_layers([2048, 32], relu_final=False)
# VPoser decoder to get the 63-dim SMPL body pose from the 32-dim code
self.vposer, _ = load_model(cfg.vposer_expr_dir, model_code=VPoser, remove_words_in_model_weights='vp_model.', disable_grad=True)
# predict the global rotation of SMPL
self.root_pose_out = make_linear_layers([2048, 3], relu_final=False)
# predict the shape parameters of SMPL
self.shape_out = make_linear_layers([2048, smpl.shape_param_dim], relu_final=False)
```

Forward function:

```python
# shape parameters
shape_param = self.shape_out(img_feat.mean((2,3)))
# camera parameters
cam_param = self.cam_out(img_feat.mean((2,3)))
# VPoser code
vposer_para = self.vposer_out(img_feat.mean((2,3)))
# body pose decoded from the VPoser code
body_pose = self.vposer.decode(vposer_para)['pose_body'].contiguous().view(-1, 63)
# global rotation
root_pose = self.root_pose_out(img_feat.mean((2,3)))
return root_pose, body_pose, shape_param, cam_param
```
Loss functions:

```python
loss['joint_proj'] = self.coord_loss(joint_proj, targets['joint_img'][:,:,:2], meta_info['joint_trunc'])
loss['regularizer_body'] = self.regular_loss(smpl_body_pose)
loss['regularizer_shape'] = self.regular_loss(smpl_shape)
```
I already calculate the 2D loss in 256x256 space and set the L2 reg weight to 1, but the losses converge around: loss_joint_proj: 7.6711, loss_regularizer_body: 1.4312, loss_regularizer_shape: 0.0089.
I wonder what mistake I am making now.
I checked the VPoser docs: "On the other hand, body poZ, VPoser's latent space representation for SMPL body, has in total 32 elements with a spherical Gaussian distribution. This means if one samples a 32 dimensional random vector from a Normal distribution and pass it through VPoser's decoder the result would be a viable human joint configuration in axis-angle representation." This suggests the output of the linear layer should follow a Gaussian distribution; can you share the function for this?
You should apply the L2 regularizer to the vposer code. All my previous and current answers are included in the paper. Please read the paper carefully for your implementation.
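A minimal sketch of that suggestion, applying the L2 regularizer to the 32-dimensional VPoser latent code (variable names follow the code posted above; treat the exact placement as an assumption):

```python
# predict the 32-dim VPoser code and decode it to the SMPL body pose
vposer_para = self.vposer_out(img_feat.mean((2, 3)))                                # (B, 32)
body_pose = self.vposer.decode(vposer_para)['pose_body'].contiguous().view(-1, 63)  # (B, 63)

# L2 regularizer on the latent code, keeping it close to the N(0, I) prior
loss['regularizer_vposer'] = torch.mean(vposer_para ** 2, dim=1)
```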
Finally I got a better result,
but in some cases it is still wrong.
You'd better use SMPL instead of SMPLX, as you do not have hand/face annotations. Also, try a stronger L2 regularizer weight.
I am using SMPL. Does a stronger L2 regularizer weight mean increasing the L2 regularizer weight?
Yes. As you can see from the qualitative results of the paper or the provided 3D pseudo-GTs of MSCOCO in this repo, the 3D pseudo-GTs are quite good. There should be some problem in your implementation :(
Yes, I think so. I strictly follow your paper and I don't know what the problem is now. If you can share some training logs and the detailed model architecture, it would be very helpful for me.
This is my source code, based on your Hand4Whole (currently only for the body): https://github.com/tranducanhbk/NeuralAnnot. If you find any issue, please tell me.
Could you show some training logs?
Finally I got good results.
Your paper mentions feeding one image at a time to generate the 3D GTs.