cddfm3d

Failed to train the APNet

Open lubovbyc opened this issue 2 years ago • 16 comments

Following the default procedure and parameter settings, I cannot train the APNet successfully. For different input latents, the network always produces the same result, e.g. the mean face.

I tried printing the output of each layer (screenshot attached in the original thread). It seems the network has already collapsed.
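One way to confirm this kind of collapse numerically, rather than by eye, is to check whether the predictions vary at all across different latent inputs. The snippet below is only a minimal sketch in plain Python: the helper names and the sample values are hypothetical and are not taken from the cddfm3d code.

```python
from statistics import pvariance


def per_dimension_variance(outputs):
    """Variance of each output dimension across a batch of predictions."""
    # zip(*outputs) regroups the batch so that each tuple holds one dimension.
    return [pvariance(dim_values) for dim_values in zip(*outputs)]


def looks_collapsed(outputs, tol=1e-8):
    """True when every output dimension is (nearly) constant across the batch."""
    return all(v < tol for v in per_dimension_variance(outputs))


# Hypothetical predictions for three different latent codes.
collapsed = [[0.1, -0.2, 0.3], [0.1, -0.2, 0.3], [0.1, -0.2, 0.3]]
healthy = [[0.1, -0.2, 0.3], [0.4, 0.0, -0.1], [-0.3, 0.2, 0.5]]

print(looks_collapsed(collapsed))
print(looks_collapsed(healthy))
```

Running the same check on the activations of each layer would show where the variance across inputs first disappears.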

lubovbyc avatar May 10 '22 08:05 lubovbyc

@lubovbyc It seems the optimization collapsed, but I'm not sure. Have you tried other losses, such as the PDC loss or the VDC loss? You may combine all losses or try another optimizer. If the problem persists, please share your generated data with me and I will try it on my server.
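For readers unfamiliar with these terms, the sketch below assumes PDC and VDC follow the usual 3DDFA-style definitions: a parameter distance cost computed directly on the predicted parameters, and a vertex distance cost computed on the reconstructed geometry. It uses a toy linear shape model in plain Python; the function names, the basis, and the numbers are illustrative assumptions and not the repository's implementation.

```python
def pdc_loss(pred_params, gt_params):
    """Parameter distance: mean squared error between parameter vectors."""
    return sum((p - g) ** 2 for p, g in zip(pred_params, gt_params)) / len(gt_params)


def reconstruct(mean_shape, basis, params):
    """Toy linear model: each coordinate is mean + (basis row . params)."""
    return [m + sum(b * p for b, p in zip(row, params))
            for m, row in zip(mean_shape, basis)]


def vdc_loss(pred_params, gt_params, mean_shape, basis):
    """Vertex distance: mean squared error between reconstructed coordinates."""
    pred_v = reconstruct(mean_shape, basis, pred_params)
    gt_v = reconstruct(mean_shape, basis, gt_params)
    return sum((a - b) ** 2 for a, b in zip(pred_v, gt_v)) / len(gt_v)


# Hypothetical toy model: three coordinates, two parameters.
mean_shape = [0.0, 0.0, 0.0]
basis = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pred = [1.0, 2.0]
gt = [1.0, 0.0]

print(pdc_loss(pred, gt))
print(vdc_loss(pred, gt, mean_shape, basis))
```

The two costs weight errors differently: PDC treats every parameter equally, while VDC weights each parameter by how much it moves the geometry, which is one reason combining them can behave differently from using either alone.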

cassiePython avatar May 12 '22 09:05 cassiePython

@cassiePython Thanks for your reply!

Have you tried other losses, such as the PDC loss or the VDC loss? You may combine all losses or try another optimizer.

Yes. I have tried different losses a few times with different weights. I also tried decreasing the initial learning rate. But all of these attempts only slowed down the collapse; none of them prevented it.

I'm not sure whether I missed some important details. Is the training of APNet normally stable? I have uploaded the generated data to Google Drive. Please check it when you are available. Thanks a lot!

lubovbyc avatar May 12 '22 14:05 lubovbyc

@lubovbyc Thanks for sharing. I will check it right after the SIGGRAPH Asia submission. Thanks for your patience.

cassiePython avatar May 17 '22 05:05 cassiePython

@lubovbyc Hi. I have tried to alleviate this. You can add more MLP layers for the APNet and use the PDC loss to improve robustness.

cassiePython avatar May 23 '22 00:05 cassiePython

@cassiePython Thanks for your reply! I will give it a try and see whether it works.

lubovbyc avatar May 23 '22 06:05 lubovbyc

@lubovbyc Hi. I have tried to alleviate this. You can add more MLP layers for the APNet and use the PDC loss to improve robustness.

Can you provide the details of the changes that avoid the collapse? I also get the mean face with the default code.

roundchuan avatar Jun 06 '22 09:06 roundchuan

@cassiePython Thanks for your reply! I will give it a try and see whether it works.

Did you get correct results from APNet by following the author's advice? I have run into the same problem as you.

roundchuan avatar Jun 06 '22 09:06 roundchuan

I am also struggling to get good results from APNet. Should I use the rendering and landmark losses as well?

hxngiee avatar Jun 06 '22 14:06 hxngiee

@lubovbyc I am still puzzled by this problem, because I could not reproduce it on my dataset. I have attached a dataset with 4K images and the corresponding checkpoint. Please check whether it works on your device: https://portland-my.sharepoint.com/:u:/g/personal/cwang355-c_my_cityu_edu_hk/EWMWjP8DHtpEqvhLtZTyfr0BerKDVixlbx8zApUS3QTngA?e=GXsNKw.

Also, please try adding the PDC loss first.

cassiePython avatar Jun 06 '22 16:06 cassiePython

I am also struggling to get good results from APNet. Should I use the rendering and landmark losses as well?

  1. Before using the pseudo ground truth to train the APNet, I tried combining the rendering loss and the landmark loss. Unfortunately, I failed to train the APNet that way.
  2. However, I have not tried the WPDC loss together with the rendering loss and the landmark loss.

cassiePython avatar Jun 06 '22 16:06 cassiePython

@lubovbyc I am still puzzled by this problem, because I could not reproduce it on my dataset. I have attached a dataset with 4K images and the corresponding checkpoint. Please check whether it works on your device: https://portland-my.sharepoint.com/:u:/g/personal/cwang355-c_my_cityu_edu_hk/EWMWjP8DHtpEqvhLtZTyfr0BerKDVixlbx8zApUS3QTngA?e=GXsNKw.

Also, please try adding the PDC loss first.

Using the model and data you provided, I still get the same face for different latent codes. The one difference is that, with your model, the rendered face is no longer the mean face.

roundchuan avatar Jun 07 '22 03:06 roundchuan

@roundchuan Can you get results like this:

image

cassiePython avatar Jun 07 '22 03:06 cassiePython

@roundchuan Can you get results like this:

No, all the rendered faces are the same. I also printed "param_lst", and the outputs of APNet are identical for every input.

roundchuan avatar Jun 07 '22 03:06 roundchuan

I am also struggling to get good results from APNet. Should I use the rendering and landmark losses as well?

  1. Before using the pseudo ground truth to train the APNet, I tried combining the rendering loss and the landmark loss. Unfortunately, I failed to train the APNet that way.
  2. However, I have not tried the WPDC loss together with the rendering loss and the landmark loss.

With your data and checkpoint, my results look just like this (image attached in the original thread).

roundchuan avatar Jun 07 '22 04:06 roundchuan

@lubovbyc I am still puzzled by this problem, because I could not reproduce it on my dataset. I have attached a dataset with 4K images and the corresponding checkpoint. Please check whether it works on your device: https://portland-my.sharepoint.com/:u:/g/personal/cwang355-c_my_cityu_edu_hk/EWMWjP8DHtpEqvhLtZTyfr0BerKDVixlbx8zApUS3QTngA?e=GXsNKw. Also, please try adding the PDC loss first.

Using the model and data you provided, I still get the same face for different latent codes. The one difference is that, with your model, the rendered face is no longer the mean face.

@cassiePython Thanks for your kind reply. Can you attach constants.pkl as well? It seems that the file was omitted.

hxngiee avatar Jun 07 '22 05:06 hxngiee

TO ALL: Recently, I generated more datasets of different sizes (e.g. 4K, 6K, 8K, 10K, 15K, 20K) and found that, on some of them, training the APNet always ends up in a local minimum (e.g. the mean face). Until this issue is fixed, you can use my pre-trained model: https://drive.google.com/drive/folders/1qNvRu8vLPD278FW7GS-I9p6-yxYhKZY9

I am trying to:

  1. Add more data to make the latent space compact.
  2. Add the rendering loss and landmark loss, following StyleRig and traditional face reconstruction methods.
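As a rough illustration of the second item, a landmark term is often measured as the average distance between predicted and target 2D landmarks and then added to the other losses with a weight. The sketch below is written in plain Python under that assumption; the function names, the weights, and the sample values are hypothetical and do not come from the cddfm3d code or from StyleRig.

```python
import math


def landmark_loss(pred_landmarks, gt_landmarks):
    """Mean Euclidean distance between corresponding 2D landmarks."""
    dists = [math.dist(p, g) for p, g in zip(pred_landmarks, gt_landmarks)]
    return sum(dists) / len(dists)


def total_loss(terms, weights):
    """Weighted sum of named loss terms; terms without a weight contribute 0."""
    return sum(weights.get(name, 0.0) * value for name, value in terms.items())


# Hypothetical landmarks and loss weights, for illustration only.
pred_lm = [(0.0, 0.0), (3.0, 4.0)]
gt_lm = [(0.0, 0.0), (0.0, 0.0)]
lm = landmark_loss(pred_lm, gt_lm)

terms = {"param": 2.0, "landmark": lm}
weights = {"param": 1.0, "landmark": 0.1}
print(total_loss(terms, weights))
```

Keeping the weights in one dictionary makes it easy to switch individual terms on and off when testing which combination avoids the mean-face solution.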

cassiePython avatar Jun 16 '22 00:06 cassiePython