I think I have reached reasonable results, but there are a few questions.
Hi,
After days of training and debugging, my reimplementation has reached the following results (currently training on ~15M images):
https://user-images.githubusercontent.com/53508410/152667374-83205472-9158-4d93-886c-38391399f4e8.mp4
https://user-images.githubusercontent.com/53508410/152667375-36c82fc2-236f-477a-abd0-a7693a8e86a0.mp4
https://user-images.githubusercontent.com/53508410/152667475-31d6a910-99d8-4c28-b636-b1f2e8ee7c71.mp4
https://user-images.githubusercontent.com/53508410/152667474-c88a874b-5507-4b60-b62d-9c3ab3178d86.mp4
There are a few changes I have made:
1. I use ReLU instead of SoftMax as the activation function for the hidden layer in the decoder architecture.
2. The neural rendering resolution is 128 x 128 from the start.
3. Blurring of images for the first 200K images is disabled.
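(For concreteness, a minimal sketch of the decoder change in item 1; the module name, dimensions, and structure are my own illustration of the paper's description, not official code:)

```python
import torch.nn as nn

class TriPlaneDecoder(nn.Module):
    """Illustrative tri-plane feature decoder: maps aggregated tri-plane
    features at a 3D point to a density scalar plus a color/feature vector.
    All dimensions here are assumptions, not the official ones."""
    def __init__(self, feat_dim=32, hidden_dim=64, out_dim=32, hidden_act="relu"):
        super().__init__()
        # The paper describes a SoftMax hidden activation; in my runs that
        # diverged, so I swap in ReLU here.
        act = nn.ReLU() if hidden_act == "relu" else nn.Softmax(dim=-1)
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim),
            act,
            nn.Linear(hidden_dim, 1 + out_dim),  # 1 density + out_dim features
        )

    def forward(self, feats):
        x = self.net(feats)
        sigma, rgb_feat = x[..., :1], x[..., 1:]
        return sigma, rgb_feat
```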
There must be something wrong with my reimplementation or understanding, since if I stick to the paper in these aspects, my model diverges quickly. Besides, for simplicity, I use the default mixed precision for the generator backbone and discriminator from StyleGAN2, and the probability of randomly swapping the generator's conditioning pose with another random pose is always set to 50%. Do these two things matter?
However, there are obviously problems with my demonstrations:
1. There is flickering and inconsistency between the low-resolution image and the one passed through the super-resolution module.
2. The results are poor at relatively large angles, with blurring and protruding parts.
https://user-images.githubusercontent.com/53508410/152667634-e3f2d409-f5a6-456a-bca8-65ed9950af9f.mp4
Thus, my question is: how can I solve these issues? Are these two behaviors normal? If not, what should I check or change to improve the results?
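(One thing I plan to double-check: the paper's dual discrimination, which is supposed to enforce exactly this raw/super-resolved consistency. My reading of it, as a sketch with names of my own choosing:)

```python
import torch
import torch.nn.functional as F

def dual_discriminator_input(img_raw: torch.Tensor, img_sr: torch.Tensor) -> torch.Tensor:
    """Upsample the raw neural rendering to the super-resolution size and
    stack the two images channel-wise, giving the discriminator a
    6-channel input so it can penalize inconsistency between them."""
    img_raw_up = F.interpolate(img_raw, size=img_sr.shape[-2:],
                               mode="bilinear", align_corners=False)
    return torch.cat([img_sr, img_raw_up], dim=1)
```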
Thanks!
The interesting fact is that after applying the truncation trick, the result becomes much better. (The left one is generated without the truncation trick, i.e. with `truncation_psi` set to 1, while the right one is generated with the truncation trick, specifically `truncation_psi` set to 0.5. Both are generated from the same `z`.)
https://user-images.githubusercontent.com/53508410/152785987-1037138a-76cf-4a9e-9c84-3a424da826ef.mp4
And it appears that mixed precision really matters. 😂
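(For anyone tuning this: in the StyleGAN2-ADA codebase, mixed precision is controlled by how many of the highest-resolution blocks run in FP16. A sketch, assuming the reimplementation inherits these knobs:)

```python
# Default StyleGAN2-ADA setting: the 4 highest-resolution blocks run in
# FP16, with activations clamped to avoid overflow.
G_kwargs = dict(num_fp16_res=4, conv_clamp=256)
D_kwargs = dict(num_fp16_res=4, conv_clamp=256)
# Setting num_fp16_res=0 forces full FP32: slower, but often more stable
# early in training.
```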
Cool!! Do you mind sharing your code as a reference?
Hi FebOne1,
There are still problems with my implementation waiting to be solved. I think it is still a bit premature, so I raised this issue here for suggestions. I will release my implementation once I match the official demonstrations.
Besides, I think the official code will actually be released soon.
Great results! What is the FID score on FFHQ@256?
Currently, the FID score on FFHQ@512 is 9.75, which is still higher than the official one.
I was wondering how you set up ray sampling after getting the estimated poses from https://github.com/microsoft/Deep3DFaceReconstruction? Since they assume the camera origin is (0, 0, 10), how do you set the start and end range of integration? I'd appreciate your reply!
Hi FebOne1,
Actually, I do not follow the pose-estimation scheme stated in the paper. I use FLAME and DECA to fit a head model and retrieve its pitch, yaw, and roll. The precise computation for transforming between coordinate systems is a bit complicated and may be hard to state clearly in text.
For your reference, and as stated in another issue, you can check the ray sampling code of CIPS-3D and pi-GAN; it just needs some modifications. A rough sketch of what I mean is below.
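(A minimal sketch of pi-GAN-style ray generation from pitch/yaw, assuming a camera on a sphere looking at the origin; the radius, FOV, and integration range are illustrative values, not the official ones:)

```python
import math
import torch

def sample_rays(pitch, yaw, radius=2.7, fov_deg=12.0, res=128):
    """Place a camera on a sphere of the given radius looking at the
    origin; return per-pixel ray origins and directions in world space.
    Integrate density along origins + t * dirs for t in
    [radius - 0.5, radius + 0.5] (this range is an assumption)."""
    cam_pos = torch.tensor([
        radius * math.cos(pitch) * math.sin(yaw),
        radius * math.sin(pitch),
        radius * math.cos(pitch) * math.cos(yaw),
    ])
    # Look-at rotation toward the origin.
    forward = -cam_pos / cam_pos.norm()
    up = torch.tensor([0.0, 1.0, 0.0])
    right = torch.linalg.cross(forward, up)
    right = right / right.norm()
    up = torch.linalg.cross(right, forward)
    # Ray direction per pixel on the image plane.
    f = 0.5 / math.tan(math.radians(fov_deg) / 2)
    coords = torch.linspace(-0.5, 0.5, res)
    yy, xx = torch.meshgrid(coords, coords, indexing="ij")
    dirs = xx[..., None] * right - yy[..., None] * up + f * forward
    dirs = dirs / dirs.norm(dim=-1, keepdim=True)
    origins = cam_pos.expand(res, res, 3)
    return origins, dirs
```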
Kevin
I am also trying to implement it, but I have found some problems. When I only use the low resolution, the synthesized background is lost, i.e. it is all black. When I only use the high resolution, the background is filled in; however, the hat on the head looks bad. I guess the generator only synthesizes the volume of the face and head, dropping everything irrelevant such as the hat and background.
The following two pictures are low-resolution and high-resolution generated images, respectively. Note that the two pictures use different noise, so they do not correspond.
By the way, can I add you on WeChat for further contact? My WeChat is "ShoutOutAjie".
I also used marching cubes to reconstruct the 3D volume. I find my result has no background or hat, which does not match the paper.
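(For context, a minimal sketch of the marching-cubes extraction I mean; `density_fn`, the grid bounds, and the iso-level are assumptions:)

```python
import torch
from skimage import measure  # scikit-image

@torch.no_grad()
def extract_mesh(density_fn, resolution=128, bound=0.5, level=10.0):
    """Query the generator's density on a regular 3D grid and run
    marching cubes on it. Regions the network assigns little density
    (e.g. background, sometimes hats) simply will not show up."""
    xs = torch.linspace(-bound, bound, resolution)
    grid = torch.stack(torch.meshgrid(xs, xs, xs, indexing="ij"), dim=-1)
    sigma = density_fn(grid.reshape(-1, 3)).reshape(resolution, resolution, resolution)
    verts, faces, normals, _ = measure.marching_cubes(sigma.cpu().numpy(), level=level)
    # Map voxel indices back into world coordinates.
    verts = verts / (resolution - 1) * (2 * bound) - bound
    return verts, faces
```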
Hi shoutOutYangJie,
Can you share your decoder architecture? I am really curious about it, since the original paper states that they use a SoftMax activation; however, after many experiments I failed with it, so I wonder whether there is some misunderstanding or trick.
Thanks! Kevin
I can share my code. Can you give me your email? I want to contact you.
Hi, I am also interested in your re-implementation details; could you share them with me? You can find me at [email protected]. Thanks so much!
Hey guys, I have some questions about the camera params P (25 scalars) that are fed to the mapping network and the neural renderer block. How can we obtain these params? I'm still confused about how to use off-the-shelf pose detectors to get them. Are they the ray_directions, ray_origins, focal, and some camera_to_world params? But then why 25 scalars? @shoutOutYangJie @RaymondJiangkw
4 x 4 = 16 extrinsics, 3 x 3 = 9 intrinsics; 25 in total.
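(In code form, my understanding of the 25-scalar conditioning vector; the flattening order is an assumption:)

```python
import torch

def camera_params_vector(cam2world: torch.Tensor, intrinsics: torch.Tensor) -> torch.Tensor:
    """Flatten the 4x4 camera-to-world (extrinsic) matrix and the 3x3
    intrinsic matrix into one 25-scalar conditioning vector."""
    assert cam2world.shape == (4, 4) and intrinsics.shape == (3, 3)
    return torch.cat([cam2world.reshape(16), intrinsics.reshape(9)])  # 16 + 9 = 25
```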
Hi @shoutOutYangJie, can you please share your code with me so I can use it as a reference? My email is [email protected]. Thank you so much!!
Hi @shoutOutYangJie, could you share your code with me? I am at [email protected]
The official code will be open soon.
Hi, I guess the blurry texture may come from a relatively narrow pose sampling range or a somewhat inaccurate conditioning camera pose, since FFHQ covers a wide range of camera poses and EG3D can learn sharp texture at large angles well once the training process converges. My implementation looks like:
Hi,
Thanks for your advice, but I don't think it applies to this situation. :)
I have tested my model at low resolution, and I am able to reach good results even at a pi/2 side view after training for ~3M images. The demonstration I showed here was based on a model that had been trained for 15M images, instead of the 25M images described in the paper. In other words, as you said, the training process hadn't converged yet. 😂
Again, thanks for your advice!
Best, Kevin
@RaymondJiangkw Could you release your implementation code?
How did you implement it?
Hi,
Thanks for your attention! I promised that I will release my reimplementation, and I definitely will.
I am currently doing some related research. If I am fortunate enough to accomplish it successfully, I will release both at once, including the reimplementation! But if I am not that fortunate, I will still release the reimplementation. :)
Besides, as @shoutOutYangJie said, I also believe that the official code will come soon. ;D
Kevin
Hi, I wonder if anyone has met the same issue of sunken or ambiguous faces, hair, hats, or foreheads? How can I fix it?
https://user-images.githubusercontent.com/53079057/160777193-62355302-d000-4eac-a2e2-e9ae8bb86065.mp4
https://user-images.githubusercontent.com/53079057/160777206-cbda8a54-b4b8-4369-b620-358cd6fd7410.mp4
https://user-images.githubusercontent.com/53079057/160777214-4b5ebd96-fdba-43c7-9f42-ae6f66e08617.mp4
https://user-images.githubusercontent.com/53079057/160777260-158416a7-ec8d-4a85-a2f0-8bb9f486a557.mp4
Besides, most of the results seem reasonable. My implementation looks like this (at 128 resolution):
https://user-images.githubusercontent.com/53079057/160778514-477dea8c-94a7-4180-b825-f50ffb96866c.mp4
https://user-images.githubusercontent.com/53079057/160778529-fd38d6e9-25ca-47c4-938c-a9d7387103c9.mp4
I wonder if someone has met the same issue
Related: #11, though it was posted by you.
Yes. I have tried some solutions, including a pose penalty like pi-GAN's and a fixed pose estimator, but there are still some failure cases. I wonder if someone has met the same issue. A sketch of the penalty I mean is below.
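(This is my paraphrase of pi-GAN's position penalty, and the weight is an assumption:)

```python
import torch.nn.functional as F

def pose_penalty(pred_pose, true_pose, weight=15.0):
    """The discriminator additionally predicts the camera pose of each
    generated image; penalizing disagreement with the pose actually used
    for rendering discourages the generator from baking pose into the
    texture (one suspected cause of sunken faces)."""
    return weight * F.mse_loss(pred_pose, true_pose)
```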
@RaymondJiangkw Thanks for your instant reply. I have been looking forward to your release.
As it may take a while for the official EG3D code to arrive, do you mind sharing your implementation, if possible? We all look forward to it!
I keep refreshing this page. Does anyone know if we will ever get a public code release?
- #3
Hi, guys and girls. I have opened my reproduction code here: https://github.com/shoutOutYangJie/EG3D-pytorch. Hope you enjoy it.
@woctezuma @kogereneliteness @FebOne1 @bruinxiong @junshutang @brandnewx @CironHan @41xu @longnhatne @kuldeeppurohit
@shoutOutYangJie Thanks for your reproduction code. It will be a good start. @RaymondJiangkw How about your release?
@shoutOutYangJie Thank you so much! I will be reviewing your code sometime next week.
@shoutOutYangJie Can you leave your email address, please? I found some implementation errors in your previously released code, and I hope to communicate with you.
Am I misunderstanding this paper? Are we not able to output a 3D mesh and texture as an output option? Or is it just "fake" 3D?
you can find me @.***
@shoutOutYangJie It seems that GitHub blocks your email; I can only see "@" :) You can find my email address on my GitHub homepage. Please send me an email~
Did the repo get deleted? The link doesn't appear to be working.
I will reopen it after several days.
I have sent an email to you just now. @zhanglonghao1992
You can find it here: https://github.com/bruinxiong/EG3D-pytorch; someone has forked it.