
I think I have reached reasonable results, but there are a few questions.

RaymondJiangkw opened this issue 3 years ago • 40 comments

Hi,

After days of training and debugging, my reimplementation has reached the following results (currently training on ~15M images):

https://user-images.githubusercontent.com/53508410/152667374-83205472-9158-4d93-886c-38391399f4e8.mp4

https://user-images.githubusercontent.com/53508410/152667375-36c82fc2-236f-477a-abd0-a7693a8e86a0.mp4

https://user-images.githubusercontent.com/53508410/152667475-31d6a910-99d8-4c28-b636-b1f2e8ee7c71.mp4

https://user-images.githubusercontent.com/53508410/152667474-c88a874b-5507-4b60-b62d-9c3ab3178d86.mp4

There are a few changes I have made: 1. I use ReLU instead of Softmax as the activation function for the hidden layer in the decoder architecture. 2. The neural rendering resolution is 128 × 128 from the start. 3. Blurring the images for the first 200K images is disabled.
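(For concreteness, a minimal sketch of a tri-plane decoder with the ReLU swap from change 1; the layer names and sizes here are assumptions, not the paper's exact architecture:)

```python
import torch.nn as nn

class TriPlaneDecoder(nn.Module):
    """Minimal sketch of the small decoder MLP (sizes are assumptions).

    Maps aggregated tri-plane features to a density scalar plus a
    32-channel color feature, using ReLU in place of the paper's Softmax.
    """
    def __init__(self, feat_dim=32, hidden_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim),
            nn.ReLU(),                      # the paper states Softmax here
            nn.Linear(hidden_dim, 1 + 32),  # 1 density + 32 feature channels
        )

    def forward(self, x):                   # x: (N_points, feat_dim)
        out = self.net(x)
        return out[..., :1], out[..., 1:]   # sigma, color feature
```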

There must be something wrong with my reimplementation or my understanding, since if I stick to the paper in these two aspects, my model diverges quickly. Besides, for simplicity, I use StyleGAN2's default mixed-precision settings for the generator backbone and the discriminator, and the probability of randomly swapping the generator's conditioning pose with another random pose is always set to 50%. Do these two things matter?
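(One plausible reading of the 50% conditioning-pose swap, sketched below; `dataset_poses` and where this sits in the training loop are assumptions:)

```python
import random

def maybe_swap_condition(render_pose, dataset_poses, swap_prob=0.5):
    """With probability swap_prob, condition the generator on a random
    dataset pose instead of the pose actually used for rendering."""
    if random.random() < swap_prob:
        return random.choice(dataset_poses)  # hypothetical list of poses
    return render_pose
```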

However, there are obviously problems with my demonstrations: 1. There is flickering and inconsistency between the low-resolution image and the one passed through the super-resolution module. 2. The results are poor at relatively large angles, with blurring and protruding parts.

https://user-images.githubusercontent.com/53508410/152667634-e3f2d409-f5a6-456a-bca8-65ed9950af9f.mp4

Thus, my question is: how can I solve these issues? Are these two problems normal? If not, what should I check or do to improve the results?

Thanks!

RaymondJiangkw avatar Feb 06 '22 04:02 RaymondJiangkw

The interesting fact is that after applying the truncation trick, the result becomes much better. (The left one is generated without the truncation trick, i.e. with truncation_psi set to 1, while the right one is generated with the truncation trick, specifically truncation_psi set to 0.5. Both are generated from the same z.)
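(The truncation trick here is the standard StyleGAN one; a minimal sketch, assuming `w_avg` is the tracked mean of the mapping network's outputs:)

```python
def truncate(w, w_avg, psi=0.5):
    """Pull a latent toward the average latent; psi = 1.0 disables
    truncation, psi = 0.5 matches the right-hand video above."""
    return w_avg + psi * (w - w_avg)
```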

https://user-images.githubusercontent.com/53508410/152785987-1037138a-76cf-4a9e-9c84-3a424da826ef.mp4

And it appears that mixed precision really matters. 😂

RaymondJiangkw avatar Feb 07 '22 12:02 RaymondJiangkw

Cool!! Do you mind sharing your code as a reference?

FebOne1 avatar Feb 07 '22 20:02 FebOne1

Hi FebOne1,

There are still problems with my implementation waiting to be solved. I think it is still a bit premature, which is why I opened this issue for suggestions. I will release my implementation once I match the official demonstration.

Besides, I think the official code will actually be released soon.

RaymondJiangkw avatar Feb 08 '22 02:02 RaymondJiangkw

Great results! What is the FID score on FFHQ@256?

MrTornado24 avatar Feb 09 '22 05:02 MrTornado24

Currently, the FID score on FFHQ@512 is 9.75, which is still higher than the official one.

RaymondJiangkw avatar Feb 09 '22 07:02 RaymondJiangkw

Hi FebOne1,

There are still problems with my implementation waiting to be solved. I think it is still a bit premature, which is why I opened this issue for suggestions. I will release my implementation once I match the official demonstration.

Besides, I think the official code will actually be released soon.

I was wondering how you set up ray sampling after getting the estimated poses from https://github.com/microsoft/Deep3DFaceReconstruction? Since they assume the camera origin is (0, 0, 10), how do you set the start and end of the integration range? I appreciate your reply!

FebOne1 avatar Feb 12 '22 06:02 FebOne1

Hi FebOne1,

Actually, I do not follow the pose-estimation scheme stated in the paper. I use FLAME and DECA to fit a head model and retrieve its pitch, yaw, and roll. The precise computation for transforming between coordinate systems is a bit complicated and may not be easy to state clearly in text.
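(A minimal sketch of one way to turn head angles into a camera-to-world matrix, by placing a camera on a sphere looking at the origin; the axis conventions, the handling of roll, and the radius are assumptions, not the author's exact computation:)

```python
import numpy as np

def pose_to_cam2world(pitch, yaw, roll, radius=2.7):
    """Place a camera on a sphere of the given radius, looking at the
    origin, from fitted head angles (roll is ignored for placement)."""
    cam_pos = radius * np.array([
        np.cos(pitch) * np.sin(yaw),
        np.sin(pitch),
        np.cos(pitch) * np.cos(yaw),
    ])
    forward = -cam_pos / np.linalg.norm(cam_pos)       # look at the origin
    right = np.cross(np.array([0.0, 1.0, 0.0]), forward)
    right /= np.linalg.norm(right)
    up = np.cross(forward, right)
    cam2world = np.eye(4)
    cam2world[:3, :3] = np.stack([right, up, forward], axis=-1)
    cam2world[:3, 3] = cam_pos
    return cam2world
```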

For your reference, and as stated in another issue, you can check the ray-sampling code of CIPS-3D and pi-GAN. It just needs some modifications.
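(For those asking about the integration range: a minimal pi-GAN-style ray sampler, sketched under assumed conventions. For example, with a camera ~10 units from the origin and the head inside a unit sphere, near ≈ 9 and far ≈ 11 would bracket the volume:)

```python
import torch

def sample_rays(cam2world, intrinsics, resolution, near, far, n_samples):
    """Sketch of a pi-GAN-style ray sampler (conventions are assumptions).

    cam2world: (4, 4) camera-to-world matrix; intrinsics: (3, 3) in pixels.
    near/far bound the integration range along each ray.
    """
    fx, fy = intrinsics[0, 0], intrinsics[1, 1]
    cx, cy = intrinsics[0, 2], intrinsics[1, 2]
    px = torch.arange(resolution, dtype=torch.float32) + 0.5  # pixel centers
    ys, xs = torch.meshgrid(px, px, indexing="ij")
    dirs_cam = torch.stack(                       # per-pixel ray directions
        [(xs - cx) / fx, (ys - cy) / fy, torch.ones_like(xs)], dim=-1)
    dirs_world = dirs_cam @ cam2world[:3, :3].T   # rotate into world space
    origins = cam2world[:3, 3].expand_as(dirs_world)
    t = torch.linspace(near, far, n_samples)      # stratify in real training
    points = origins[..., None, :] + dirs_world[..., None, :] * t[:, None]
    return origins, dirs_world, points            # points: (H, W, n_samples, 3)
```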

Kevin

RaymondJiangkw avatar Feb 12 '22 09:02 RaymondJiangkw

I also tried to implement it, but I found some problems. When I use only the low resolution, the synthesized background is lost: it is all black. When I use only the high resolution, the background is filled in, but the hat on the head looks bad. I guess the generator only synthesizes the volume of the face and head, and drops everything irrelevant, like the hat and the background.

The following two pictures are the low-resolution and high-resolution generated images, respectively. Note that the two pictures use different noise, so they do not correspond.

image

image

By the way, can I add you on WeChat for further contact? My WeChat ID is "ShoutOutAjie".

shoutOutYangJie avatar Feb 14 '22 06:02 shoutOutYangJie

I also use marching cubes to reconstruct the 3D volume. I find my result has no background or hat, which does not match the paper. image
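(For anyone checking their geometry the same way: a minimal marching-cubes sketch. `density_fn`, the box size, and the iso-level are assumptions about your model, and regions whose density never crosses the level, such as a flat background, simply produce no surface:)

```python
import torch
from skimage import measure  # scikit-image

@torch.no_grad()
def extract_mesh(density_fn, grid_res=128, box=0.5, level=10.0):
    """Sample densities on a cube and run marching cubes.
    density_fn: maps (N, 3) world points to (N,) densities (hypothetical)."""
    axis = torch.linspace(-box, box, grid_res)
    grid = torch.stack(torch.meshgrid(axis, axis, axis, indexing="ij"), dim=-1)
    sigma = density_fn(grid.reshape(-1, 3)).reshape(grid_res, grid_res, grid_res)
    verts, faces, normals, _ = measure.marching_cubes(sigma.cpu().numpy(),
                                                      level=level)
    verts = verts / (grid_res - 1) * (2 * box) - box  # index -> world coords
    return verts, faces
```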

shoutOutYangJie avatar Feb 14 '22 06:02 shoutOutYangJie

Hi shoutOutYangJie,

Can you share your decoder architecture? I am really curious about it, since the original paper states that they use a Softmax activation. However, all my experiments with it failed, so I wonder whether there is some misunderstanding or a missing trick?

Thanks! Kevin

RaymondJiangkw avatar Feb 14 '22 06:02 RaymondJiangkw

I can share my code. Can you give me your email? I want to get in touch with you.

shoutOutYangJie avatar Feb 14 '22 06:02 shoutOutYangJie

I can share my code. Can you give me your email? I want to get in touch with you.

Hi, I am also interested in your reimplementation details; could you share them with me? You can find me at [email protected]. Thanks so much!

dreamcraft3d avatar Feb 14 '22 08:02 dreamcraft3d

Hey guys, I have some questions about the camera params P (25 scalars) fed into the mapping network and the neural rendering block. How can we obtain these params? I'm still confused about how to use off-the-shelf pose detectors to get them. Are they the ray_directions, ray_origins, focal, and some camera_to_world params? And why 25 scalars? @shoutOutYangJie @RaymondJiangkw

41xu avatar Feb 15 '22 03:02 41xu

4 × 4 = 16 extrinsics, 3 × 3 = 9 intrinsics: 25 in total.
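(As a sketch, assuming row-major flattening with extrinsics first; the exact ordering and normalization of the conditioning vector are not confirmed here:)

```python
import numpy as np

def camera_params(cam2world, intrinsics):
    """Flatten the 4x4 camera-to-world extrinsics and 3x3 intrinsics
    into a 25-dim conditioning vector (16 + 9 = 25)."""
    assert cam2world.shape == (4, 4) and intrinsics.shape == (3, 3)
    return np.concatenate([cam2world.reshape(-1), intrinsics.reshape(-1)])
```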

FebOne1 avatar Feb 15 '22 04:02 FebOne1

I can share my code. Can you give me your email? I want to get in touch with you.

Hi @shoutOutYangJie, can you please share your code with me so I can use it as a reference? My email is [email protected]. Thank you so much!!

longnhatne avatar Feb 17 '22 07:02 longnhatne

Hi @shoutOutYangJie, could you share your code with me? I am at [email protected]

kuldeeppurohit avatar Mar 14 '22 08:03 kuldeeppurohit

The official code will be open-sourced soon.


shoutOutYangJie avatar Mar 14 '22 09:03 shoutOutYangJie

After days of training and debugging, my reimplementation has reached the following results […] Thus, my question is: how can I solve these issues? Are these two problems normal? If not, what should I check or do to improve the results?

Hi, I guess the blurry texture may come from a relatively narrow pose sampling range or somewhat inaccurate conditioning camera poses, since FFHQ covers a wide range of camera poses and EG3D can learn sharp texture at large angles once training converges. My implementation looks like eg3d_reprod

MrTornado24 avatar Mar 25 '22 16:03 MrTornado24

I guess the blurry texture may come from a relatively narrow pose sampling range or somewhat inaccurate conditioning camera poses […]

Hi,

Thanks for your advice, but I don't think they apply to this situation. :)

I have tested my model at low resolution and was able to reach good results even at a pi/2 side view after training on ~3M images. The demonstration I showed here was based on a model trained on 15M images, not the 25M images described in the paper. In other words, as you said, training hadn't converged yet. 😂

Again, thanks for your advice!

Best, Kevin

RaymondJiangkw avatar Mar 26 '22 03:03 RaymondJiangkw

@RaymondJiangkw Could you release your implementation code?

bruinxiong avatar Mar 29 '22 07:03 bruinxiong

How did you implement it?

brandnewx avatar Mar 29 '22 08:03 brandnewx

@RaymondJiangkw Could you release your implementation code?

Hi,

Thanks for your attention! I promised to release my reimplementation and I definitely will.

I am currently doing some related research. If I am fortunate enough to accomplish it, I will release both at once, including the reimplementation! If not, I will still release the reimplementation. :)

Besides, as @shoutOutYangJie said, I also believe the official code will come soon. ;D

Kevin

RaymondJiangkw avatar Mar 29 '22 08:03 RaymondJiangkw

Hi, I wonder if anyone has met the same issue, such as sunken or ambiguous faces, hair, hats, or foreheads? How can I fix it?

https://user-images.githubusercontent.com/53079057/160777193-62355302-d000-4eac-a2e2-e9ae8bb86065.mp4

https://user-images.githubusercontent.com/53079057/160777206-cbda8a54-b4b8-4369-b620-358cd6fd7410.mp4

https://user-images.githubusercontent.com/53079057/160777214-4b5ebd96-fdba-43c7-9f42-ae6f66e08617.mp4

https://user-images.githubusercontent.com/53079057/160777260-158416a7-ec8d-4a85-a2f0-8bb9f486a557.mp4

Besides, most of the results seem reasonable. My implementation looks like this (at 128 resolution): 0_6_fixed 1_2_fixed

https://user-images.githubusercontent.com/53079057/160778514-477dea8c-94a7-4180-b825-f50ffb96866c.mp4

https://user-images.githubusercontent.com/53079057/160778529-fd38d6e9-25ca-47c4-938c-a9d7387103c9.mp4

junshutang avatar Mar 30 '22 07:03 junshutang

I wonder if someone has met the same issue

Related:

  • #11

though it was posted by you.

woctezuma avatar Mar 30 '22 11:03 woctezuma

I wonder if someone has met the same issue

Related:

  • #11

though it was posted by you.

Yes. I have tried some solutions, including a pose penalty like pi-GAN's and a fixed pose estimator, but there are still some failure cases. I wonder if someone has met the same issue.

junshutang avatar Mar 30 '22 11:03 junshutang

@RaymondJiangkw Could you release your implementation code?

Hi,

Thanks for your attention! I promised to release my reimplementation and I definitely will.

I am currently doing some related research. If I am fortunate enough to accomplish it, I will release both at once, including the reimplementation! If not, I will still release the reimplementation. :)

Besides, as @shoutOutYangJie said, I also believe the official code will come soon. ;D

Kevin


@RaymondJiangkw Thanks for your quick reply. I have been looking forward to your release.

bruinxiong avatar Mar 30 '22 14:03 bruinxiong

I guess the blurry texture may come from a relatively narrow pose sampling range […] My implementation looks like eg3d_reprod

As it may take a while to get official EG3D code, do you mind sharing your implementation if possible? We are all looking forward to it!

FebOne1 avatar Mar 30 '22 16:03 FebOne1

I keep refreshing this page. Does anyone know if we will ever get a public code release?

kogereneliteness avatar Apr 08 '22 13:04 kogereneliteness

I keep refreshing this page. Does anyone know if we ever get a public code release?

  • #3

woctezuma avatar Apr 08 '22 13:04 woctezuma

Hi, guys and girls. I have open-sourced my reproduction code here: https://github.com/shoutOutYangJie/EG3D-pytorch. Hope you enjoy it.

@woctezuma @kogereneliteness @FebOne1 @bruinxiong @junshutang @brandnewx @CironHan @41xu @longnhatne @kuldeeppurohit

shoutOutYangJie avatar Apr 09 '22 12:04 shoutOutYangJie

@shoutOutYangJie Thanks for your reproduction code. It will be a good starting point. @RaymondJiangkw How about your release?

bruinxiong avatar Apr 10 '22 01:04 bruinxiong

@shoutOutYangJie Thank you so much! I will be reviewing your code sometime next week.

brandnewx avatar Apr 10 '22 01:04 brandnewx

@shoutOutYangJie Can you leave your email address, please? I found some implementation errors in your previously released code and hope to discuss them with you.

zhanglonghao1992 avatar Apr 24 '22 06:04 zhanglonghao1992

Am I misunderstanding this paper? Do we not get a 3D mesh and texture as an output option from this? Or is it just "fake" 3D?

kogereneliteness avatar Apr 24 '22 06:04 kogereneliteness

You can find me at @.***


shoutOutYangJie avatar Apr 24 '22 07:04 shoutOutYangJie

@shoutOutYangJie It seems that GitHub blocked your email; I can only see "@" :) You can find my email address on my GitHub homepage. Please send me an email~

zhanglonghao1992 avatar Apr 24 '22 07:04 zhanglonghao1992

Hi, guys and girls. I have open-sourced my reproduction code here: https://github.com/shoutOutYangJie/EG3D-pytorch. Hope you enjoy it.

@woctezuma @kogereneliteness @FebOne1 @bruinxiong @junshutang @brandnewx @CironHan @41xu @longnhatne @kuldeeppurohit

Did the repo get deleted? The link doesn't appear to be working.

Skylion007 avatar Apr 24 '22 19:04 Skylion007

I will reopen it after several days.


shoutOutYangJie avatar Apr 25 '22 01:04 shoutOutYangJie

I have just sent you an email. @zhanglonghao1992

shoutOutYangJie avatar Apr 25 '22 02:04 shoutOutYangJie

You can find it here: https://github.com/bruinxiong/EG3D-pytorch; someone has forked it.

shoutOutYangJie avatar Apr 25 '22 02:04 shoutOutYangJie