
Unable to reproduce the results and pretrained model loading failure

Open · YiLin32Wang opened this issue 2 years ago · 1 comment

Hi Yujin,

Thank you so much for sharing the code, and congratulations on the nice CVPR paper! The idea really interests me, but I am currently running into trouble reproducing the results, both when evaluating the pretrained model and when training from scratch.

I'm using PyTorch 1.7.1 on CUDA 10.2 to run the code. The major change I made was converting the non-static forward/backward methods in laplacianloss.py to static methods, since non-static autograd functions are deprecated after PyTorch 1.5; the changes are shown in the pull request. The issues I hit are described below as a timeline of several attempts:
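For reference, the post-1.5 pattern looks roughly like the sketch below: forward/backward become @staticmethod and receive ctx explicitly. The loss body here is a simplified stand-in (a plain ||Lv||² smoothness term), not the actual laplacianloss.py implementation:

```python
import torch

class LaplacianLoss(torch.autograd.Function):
    """New-style autograd Function: static methods with an explicit ctx."""

    @staticmethod
    def forward(ctx, verts, laplacian):
        ctx.save_for_backward(verts, laplacian)
        # Quadratic smoothness term: sum of squared entries of L @ v.
        return (laplacian @ verts).pow(2).sum()

    @staticmethod
    def backward(ctx, grad_output):
        verts, laplacian = ctx.saved_tensors
        # d/dv of ||L v||^2 = 2 L^T L v; scale by the incoming gradient.
        grad_verts = 2.0 * laplacian.t() @ (laplacian @ verts) * grad_output
        # One gradient per forward input; the Laplacian matrix gets None.
        return grad_verts, None

# usage: loss = LaplacianLoss.apply(verts, laplacian_matrix)
```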

  1. Tried to directly evaluate the pretrained model, but the module "renderer_NR" failed to load. It seems the module name of the neural renderer used in the open-source code ("renderer_NR") is inconsistent with the one in the load_model/save_model functions ("renderer"), so the pretrained NR module saved under the current version cannot be loaded. I revised the module name in load_model/save_model for future loading and saving.

  2. Tried to use the pretrained model (without the NR module, which could not be loaded) for finetuning with the parameters given in the paper (initial lr=0.00025, *0.5 every 30 epochs, all other hyper-parameters unchanged), but the rendered silhouette grew dramatically and then collapsed to a point after only 3-4 epochs (see the attached image).

  3. Tried to finetune the pretrained model with decreased texture hyper-parameters (lambda_texture 0.005 -> 0.003, lambda_tex_reg 0.01 -> 0.005) and without the revised laplacian loss, and got a more reasonable result (quantitative results attached), but it still does not reach the results presented in the paper.

  4. Tried to finetune with the revised laplacian loss, and got much worse results than in 3 (results attached).

  5. Tried to train from scratch with stage-wise training. First, trained the 3D reconstruction network only, with initial lr=0.001 (*0.5 every 30 epochs) for 90 epochs (results attached). Then trained only the neural renderer and encoder with lr=0.001 for 30 epochs (results attached). But after a further 60 epochs of finetuning, there is joint output but no rendered images are shown (results attached).
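As a sketch of the workaround in point 1, one alternative to renaming the module is to remap the checkpoint keys at load time. `remap_renderer_keys` is a hypothetical helper; the `"renderer."`/`"renderer_NR."` prefixes are taken from the names above, and the real checkpoint layout may differ:

```python
def remap_renderer_keys(state_dict, old="renderer.", new="renderer_NR."):
    """Rename keys saved under the old module prefix to the current one."""
    remapped = {}
    for key, value in state_dict.items():
        if key.startswith(old):
            remapped[new + key[len(old):]] = value
        else:
            remapped[key] = value
    return remapped

# usage (hypothetical): model.load_state_dict(remap_renderer_keys(ckpt["model"]))
```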
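The decay schedule quoted in points 2 and 5 reduces to a one-liner; a minimal sketch, assuming a plain step decay with the values stated above (in PyTorch this corresponds to `torch.optim.lr_scheduler.StepLR`):

```python
def step_decay_lr(epoch, base_lr=0.00025, step=30, gamma=0.5):
    """Halve the learning rate every `step` epochs, starting from `base_lr`."""
    return base_lr * gamma ** (epoch // step)
```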

Did I make some mistake or misunderstand something in the paper? I would really appreciate it if you could help me out by sharing some insights.

YiLin32Wang · Feb 15 '22 20:02

Hi. Sorry for the late reply.

For 1, I don't think you need to reload/train the neural renderer. As far as I understand, it is used so that photometric consistency helps with geometry learning, so any differentiable renderer can be used here. For 2 and 4, I suggest stage-wise training instead of training everything together: https://github.com/TerenceCYJ/S2HAND/issues/4 For the others, I am not quite sure where the problem is; did you also use the 2D network?
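The stage-wise suggestion above is usually implemented by toggling `requires_grad` on the relevant submodules between stages. A minimal sketch, assuming the renderer submodule is named `renderer_NR` as in this thread (the helper name and stage split are illustrative, not the repository's actual training script):

```python
import torch

def set_stage(model, train_renderer):
    """Freeze the neural renderer while training the reconstruction branch,
    or vice versa, by flipping requires_grad per parameter."""
    for name, param in model.named_parameters():
        is_renderer = name.startswith("renderer_NR")
        param.requires_grad = (is_renderer == train_renderer)
```

An optimizer built afterwards can then filter on `p.requires_grad` so frozen parameters receive no updates.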

TerenceCYJ · Mar 05 '22 19:03