NeuralAnnot_RELEASE
Function projecting 3D joint coordinates Jˆ3D to the image plane with predicted camera parameters.
Hi mks0601! I am reproducing your paper, but I am stuck at projecting the 3D joint coordinates Jˆ3D to the image plane with the predicted camera parameters to obtain the 2D joints. Could you make this function public, along with the λwild,θ parameter for training the SMPL model? Many thanks.
Please refer to https://github.com/mks0601/Hand4Whole_RELEASE/blob/7c3b7fd6ac3e0824210796a2267fe7f8fdc5e968/main/model.py#L77
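The linked code projects camera-space joints to the image plane using a fixed virtual focal length and principal point. A minimal sketch of that kind of projection, assuming config-style `focal` and `princpt` constants (the exact values and variable names in Hand4Whole may differ):

```python
import torch

# assumed virtual camera intrinsics (Hand4Whole defines similar constants in its config)
focal = (1500.0, 1500.0)   # virtual focal length (fx, fy)
princpt = (128.0, 128.0)   # principal point for a 256x256 image

def project_joints(joint_cam, cam_trans):
    """Project camera-space 3D joints (B, J, 3) to the image plane.

    cam_trans: predicted camera translation (B, 3) added to the root-relative joints.
    Returns (B, J, 2) pixel coordinates.
    """
    joint_cam = joint_cam + cam_trans[:, None, :]
    x = joint_cam[:, :, 0] / (joint_cam[:, :, 2] + 1e-4) * focal[0] + princpt[0]
    y = joint_cam[:, :, 1] / (joint_cam[:, :, 2] + 1e-4) * focal[1] + princpt[1]
    return torch.stack((x, y), dim=2)
```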
How about λwild,θ for calculating the SMPL parameter loss? Could you clarify it? Many thanks for your kind response.
I set it to 1 when calculating the loss on 256x256 image space.
Thank you, you are very kind, and I have one more question. This is my regularizer loss function:

```python
class RegularizeLoss(nn.Module):
    def __init__(self):
        super(RegularizeLoss, self).__init__()

    def forward(self, pose):
        loss = torch.pow(pose, 2)
        loss = torch.sum(loss, dim=1)
        return loss
```

Is `pose` only the 3D body rotations, or does it include the 3D global rotation?
I did mean instead of sum here: `loss = torch.sum(loss, dim=1)`. Also, the 3D global rotation should be excluded.
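A minimal corrected sketch following this answer (mean over the pose dimensions, with the global rotation excluded before the call):

```python
import torch
import torch.nn as nn

class RegularizeLoss(nn.Module):
    def __init__(self):
        super(RegularizeLoss, self).__init__()

    def forward(self, pose):
        # pose: (batch, (J-1)*3) axis-angle body pose, global rotation already excluded
        loss = torch.pow(pose, 2)
        loss = torch.mean(loss, dim=1)  # mean instead of sum, per the answer above
        return loss
```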
I trained with my own dataset in MSCOCO format (no face and hand boxes) for 7 epochs.
The 2D loss and the regularizer loss converge to 1.2 and 0.45, respectively.
I used RotationNet from this link to predict the 3D pseudo-GTs: https://github.com/mks0601/Hand4Whole_RELEASE/blob/7c3b7fd6ac3e0824210796a2267fe7f8fdc5e968/common/nets/module.py#L24
And this is the result.
The 2D joints in the images, obtained from the 3D joint coordinates and the predicted camera parameters, look good, but the rendered 3D result is bad.
This is the 2D result.
This is the 3D result.
Please tell me what mistake I made.
Thank you so much
I don't know your exact training settings, but I guess you should increase the L2 regularizer weight.
I follow your Hand4Whole training setup: https://github.com/mks0601/Hand4Whole_RELEASE (https://github.com/mks0601/Hand4Whole_RELEASE/blob/7c3b7fd6ac3e0824210796a2267fe7f8fdc5e968/main/config.py#L50).

This is the forward code to get the 3D pseudo-GT predictions:

```python
if cfg.parts == 'body':
    img_feat = self.forward_backbone(inputs, self.backbone)
    joint_img = smpl.reduce_joint_set(targets['joint_img'])
    smpl_root_pose, smpl_body_pose, smpl_shape, cam_trans = self.forward_rotation_net(img_feat, joint_img, self.rotation_net)
    joint_proj, joint_cam, mesh_cam = self.get_coord({'root_pose': smpl_root_pose, 'body_pose': smpl_body_pose, 'shape': smpl_shape, 'cam_trans': cam_trans}, mode)
    smpl_body_pose = smpl_body_pose.view(-1, (smpl.orig_joint_num - 1) * 3)
    smpl_pose = torch.cat((smpl_root_pose, smpl_body_pose), 1)
```

And this is the loss calculation:

```python
loss['joint_proj'] = self.coord_loss(joint_proj, targets['joint_img'][:,:,:2], meta_info['joint_trunc'])
loss['regularizer'] = self.regular_loss(smpl_body_pose)
```
This is the backward pass:

```python
loss = {k: loss[k].mean() for k in loss}
sum(loss[k] for k in loss).backward()
```
Anyway, I will increase the L2 regularizer weight (set it to 2) and test again.
After setting the L2 regularizer weight to 1 and calculating the 2D loss with the (x, y) coordinates resized to 8x8 space, I got these results.
A little bit better, but still not good enough.
Do you have any idea about this issue?
Did you apply the L2 reg on the SMPL shape parameter? I applied the L2 reg with weight 1 to the SMPL shape parameter (beta).
Also, I calculated 2D joint loss in 256x256 space.
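A minimal sketch consolidating the answers so far (2D joint loss in 256x256 space with weight 1, plus L2 regularizers on both the body pose θ and the shape β); `coord_loss` and `regularize_loss` are placeholders for the loss modules shown earlier in this thread, not names from the repo:

```python
# assumed: joint_proj and targets['joint_img'][:, :, :2] are both in 256x256 image space
loss = {}
loss['joint_proj'] = coord_loss(joint_proj, targets['joint_img'][:, :, :2], meta_info['joint_trunc'])
loss['regularizer_pose'] = regularize_loss(smpl_body_pose)   # L2 on theta, global rotation excluded
loss['regularizer_shape'] = regularize_loss(smpl_shape)      # L2 on beta, weight 1
sum(loss[k].mean() for k in loss).backward()
```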
I did not apply the L2 reg on the SMPL shape parameter; I only calculated the loss on the 3D body rotations (θ). I will try adding the SMPL shape parameter to the loss. Thank you so much.
Could you share the loss function and forward pass of RotationNet with me? I still cannot reach good results. Many thanks.
You can see this repo: https://github.com/mks0601/Hand4Whole_RELEASE
I already follow this repo, but there are 2 things I am not clear about: https://github.com/mks0601/Hand4Whole_RELEASE/blob/2f7a608cb05cf586d1e80c76e507d0802a6c13f0/main/config.py#L50
- You calculated the 2D joint loss in 256x256 space, so `self.output_hm_shape = (8, 8, 6)` needs to change to (256, 256, 256), right?
- Because I use 2D images, the input does not have a Z axis, so what value do I need to set for the Z axis of the input before feeding it to the RotationNet forward function? https://github.com/mks0601/Hand4Whole_RELEASE/blob/2f7a608cb05cf586d1e80c76e507d0802a6c13f0/common/nets/module.py#L58
- For the network of NeuralAnnot, I calculated the 2D loss in 256x256 space. For the network of Hand4Whole, I calculated the 2D loss in 8x8 space. The networks of the two works are different.
- joint_coord_img is estimated from PositionNet.
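A minimal sketch of what that answer implies, assuming a Hand4Whole-style pipeline where PositionNet predicts per-joint (x, y, z) coordinates in heatmap space, so no ground-truth Z value is needed as RotationNet input; the signatures below are assumptions, not the repo's verbatim API:

```python
# PositionNet estimates joint_coord_img (B, J, 3) from the backbone feature map,
# including a depth (z) coordinate, so RotationNet does not require GT depth.
joint_coord_img = self.position_net(img_feat)                                    # hypothetical signature
root_pose, body_pose, shape, cam_param = self.rotation_net(img_feat, joint_coord_img)
```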
I strictly follow your guidance but cannot get good results. Below are the losses I calculate for the backward pass:

```python
# 2D loss of PositionNet (in 8x8 space)
loss['joint_img'] = self.coord_loss(joint_img, smpl.reduce_joint_set(targets['joint_img'])/32., smpl.reduce_joint_set(meta_info['joint_trunc']), meta_info['is_3D'])
# 2D loss of RotationNet (in 256x256 space)
loss['joint_proj'] = self.coord_loss(joint_proj, targets['joint_img'][:,:,:2], meta_info['joint_trunc'])
loss['regularizer_body'] = self.regular_loss(smpl_body_pose)
loss['regularizer_shape'] = self.regular_loss(smpl_shape)
```
These are the loss classes:

```python
class RegularizeLoss(nn.Module):
    def __init__(self):
        super(RegularizeLoss, self).__init__()

    def forward(self, smpl_para):
        loss = torch.pow(smpl_para, 2)
        return loss
```
```python
class CoordLoss(nn.Module):
    def __init__(self):
        super(CoordLoss, self).__init__()

    def forward(self, coord_out, coord_gt, valid, is_3D=None):
        loss = torch.abs(coord_out - coord_gt) * valid
        if is_3D is not None:
            loss_z = loss[:,:,2:] * is_3D[:,None,None].float()
            loss = torch.cat((loss[:,:,:2], loss_z), 2)
        return loss
```
Please tell me what issue I have.
These are the results.
Are you using VPoser? As written in the paper, VPoser is helpful.
I did not use VPoser. Maybe that is the big mistake I made. Thank you for your advice.
I used VPoser to decode the SMPL pose from the 32-dimensional code but still did not get good results. Below is the code to get root_pose, body_pose, shape_param, cam_param:
```python
# __init__: predict the 32-dimensional VPoser code
self.vposer_out = make_linear_layers([2048, 32], relu_final=False)
# VPoser decoder to get the 63-dim SMPL body pose from the 32-dim code
self.vposer, _ = load_model(cfg.vposer_expr_dir, model_code=VPoser, remove_words_in_model_weights='vp_model.', disable_grad=True)
# predict the global rotation of SMPL
self.root_pose_out = make_linear_layers([2048, 3], relu_final=False)
# predict the shape parameters of SMPL
self.shape_out = make_linear_layers([2048, smpl.shape_param_dim], relu_final=False)
```

Forward function:

```python
# shape parameters
shape_param = self.shape_out(img_feat.mean((2,3)))
# camera parameters
cam_param = self.cam_out(img_feat.mean((2,3)))
# VPoser code
vposer_para = self.vposer_out(img_feat.mean((2,3)))
# body pose decoded from the VPoser code
body_pose = self.vposer.decode(vposer_para)['pose_body'].contiguous().view(-1, 63)
# global rotation
root_pose = self.root_pose_out(img_feat.mean((2,3)))
return root_pose, body_pose, shape_param, cam_param
```
Loss functions:

```python
loss['joint_proj'] = self.coord_loss(joint_proj, targets['joint_img'][:,:,:2], meta_info['joint_trunc'])
loss['regularizer_body'] = self.regular_loss(smpl_body_pose)
loss['regularizer_shape'] = self.regular_loss(smpl_shape)
```
I already calculate the 2D loss in 256x256 space and set the L2 reg weight to 1, but the losses converge around: loss_joint_proj: 7.6711, loss_regularizer_body: 1.4312, loss_regularizer_shape: 0.0089.
I wonder what mistake I am making now.
I checked the VPoser docs: "On the other hand, body poZ, VPoser's latent space representation for SMPL body, has in total 32 elements with a spherical Gaussian distribution. This means if one samples a 32 dimensional random vector from a Normal distribution and pass it through VPoser's decoder the result would be a viable human joint configuration in axis-angle representation." This suggests the output of the linear layer should follow a Gaussian distribution; can you share the function for this?
You should apply the L2 regularizer to the vposer code. All my previous and current answers are included in the paper. Please read the paper carefully for your implementation.
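A minimal sketch of that suggestion, applying the L2 regularizer to the 32-dimensional VPoser latent code (variable names follow the code posted above; treat the exact placement as an assumption):

```python
# predict the 32-dim VPoser code and decode it to the SMPL body pose
vposer_para = self.vposer_out(img_feat.mean((2, 3)))                                # (B, 32)
body_pose = self.vposer.decode(vposer_para)['pose_body'].contiguous().view(-1, 63)  # (B, 63)

# L2 regularizer on the latent code, keeping it close to the N(0, I) prior
loss['regularizer_vposer'] = torch.mean(vposer_para ** 2, dim=1)
```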
Finally I got a better result,
but in some cases it is still wrong.
You'd better use SMPL instead of SMPLX, as you do not have hand/face annotations. Also, try a stronger L2 regularizer weight.
I am using SMPL. Does a stronger L2 regularizer weight mean increasing the L2 regularizer weight?
Yes. As you can see from the qualitative results of the paper or the provided 3D pseudo-GTs of MSCOCO in this repo, the 3D pseudo-GTs are quite good. There should be some problem in your implementation :(
Yes, I think so. I strictly follow your paper and I don't know what the problem is now. If you can share some training logs and the detailed model architecture, it would be very helpful for me.
This is my source code, based on your Hand4Whole (currently only for the body): https://github.com/tranducanhbk/NeuralAnnot. If you find any issue, please tell me.
Could you show some training logs?
Finally I got good results.
Your paper mentions feeding one image at a time to generate the 3D GTs.