EG3D-projector
Poor inversion quality on a random image
I just tried a random image and found that the quality of the inversion is very poor.
Do you have any ideas about it?

Did you align the input image according to https://github.com/NVlabs/eg3d/blob/main/dataset_preprocessing/ffhq/preprocess_in_the_wild.py ?
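(If you're unsure whether the preprocessing succeeded, a quick sanity check is to inspect the dataset.json it writes. A minimal sketch, assuming the usual EG3D layout of [filename, 25-value label] entries and 512x512 aligned crops; file names and paths here are hypothetical:)

```python
import json
from PIL import Image

# Sanity-check the output of preprocess_in_the_wild.py before running inversion.
with open('dataset.json') as f:
    labels = json.load(f)['labels']

for fname, label in labels:
    assert len(label) == 25, f'{fname}: expected 16 extrinsic + 9 intrinsic values'
    with Image.open(fname) as img:  # assumes image paths are relative to the cwd
        assert img.size == (512, 512), f'{fname}: expected a 512x512 aligned crop'
```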
I am also getting poor results when following the steps in the original EG3D repo (an in-the-wild image run through the preprocessing steps, with the FFHQ pretrained model). The arguments used are all defaults, the same as in the README. Do you have any ideas to improve the results? Thanks!
test image:
results (img00000366_w_plus_pretrained.mp4):
https://user-images.githubusercontent.com/46330265/213952911-8ca9a2a8-b15a-4a5b-9d72-6b12861fdda2.mp4
Hey, I think it is caused by both the EG3D model itself and my simple inversion project. For the EG3D model, performance on extreme poses is much worse than on frontal ones, due to the imbalanced pose distribution in FFHQ; this is still a challenging problem. For this simple inversion project, the input image only provides information from a single view, so it is hard for PTI to generate full-view results. I recommend using a better inversion method, e.g., https://github.com/jiaxinxie97/HFGI3D
Actually, this repo is just a simple implementation of the projector mentioned in the EG3D paper, not the best choice for projecting an image into EG3D's latent space :)
Thanks! But from the official website, it seems they are able to get pretty good results with a single image + PTI, though.
inversion_compressed.mp4:
https://user-images.githubusercontent.com/46330265/213961159-71308bd9-0ffc-4abc-b45c-8925cc5cb0d5.mp4
The input image you used has the ears occluded, while the input images in the video contain more complete information. This repo cannot generate regions that are occluded.
You can see the results I generated with my repo: https://github.com/NVlabs/eg3d/issues/28#issuecomment-1159512947,
and here are the re-aligned input image and the input camera parameters:
01457.zip
Weird that I am getting a slightly different camera matrix than yours after following pytorch_3d_recon:
mine:
[ 0.9982488751411438,   0.01629943959414959,  -0.056863944977521896, 0.14564249100599475,
  0.010219544172286987, -0.9943544864654541,  -0.1056165024638176,   0.2914214260210597,
 -0.05826440826058388,   0.10485044121742249, -0.9927797317504883,   2.6802727132270365,
  0.0, 0.0, 0.0, 1.0,
  4.2647, 0.0, 0.5, 0.0, 4.2647, 0.5, 0.0, 0.0, 1.0 ]
yours:
array([ 0.99852723,  0.01640092, -0.05171374,  0.13343237,
        0.01112113, -0.9948467 , -0.10077892,  0.27816952,
       -0.05310011,  0.10005538, -0.99356395,  2.6823157 ,
        0.        ,  0.        ,  0.        ,  1.        ,
        4.2647    ,  0.        ,  0.5       ,  0.        ,
        4.2647    ,  0.5       ,  0.        ,  0.        ,  1.        ])
Yes, the matrix I uploaded was obtained directly from the dataset.json (produced by https://github.com/NVlabs/eg3d/blob/main/dataset_preprocessing/ffhq/runme.py), and the image is from the FFHQ dataset. It is fine to use a slightly different matrix.
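For reference, those 25 values decode as a flattened 4x4 cam2world extrinsic matrix followed by a flattened 3x3 normalized intrinsic matrix (this is how EG3D's dataset.json labels are laid out, as far as I can tell). A minimal numpy sketch to unpack and compare the two vectors quoted above (values rounded; the variable names are just for illustration):

```python
import numpy as np

def unpack_label(label):
    """Split a 25-dim EG3D camera label into a 4x4 cam2world matrix and 3x3 intrinsics."""
    label = np.asarray(label, dtype=np.float64)
    return label[:16].reshape(4, 4), label[16:].reshape(3, 3)

mine = [0.99824888, 0.01629944, -0.05686394, 0.14564249,
        0.01021954, -0.99435449, -0.10561650, 0.29142143,
        -0.05826441, 0.10485044, -0.99277973, 2.68027271,
        0.0, 0.0, 0.0, 1.0,
        4.2647, 0.0, 0.5, 0.0, 4.2647, 0.5, 0.0, 0.0, 1.0]
yours = [0.99852723, 0.01640092, -0.05171374, 0.13343237,
         0.01112113, -0.99484670, -0.10077892, 0.27816952,
         -0.05310011, 0.10005538, -0.99356395, 2.68231570,
         0.0, 0.0, 0.0, 1.0,
         4.2647, 0.0, 0.5, 0.0, 4.2647, 0.5, 0.0, 0.0, 1.0]

ext_a, intr_a = unpack_label(mine)
ext_b, intr_b = unpack_label(yours)
print(np.abs(ext_a - ext_b).max())  # ~1.3e-2: the extrinsics differ only slightly
print(np.allclose(intr_a, intr_b))  # True: identical intrinsics (focal 4.2647, principal point 0.5)
```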
Hello, I have a follow-up question regarding your implementation here. You might also have noticed the problem that the optimized latent code is fed directly into the generator without the mapping network, i.e., no camera information is included.
I also tried directly using the optimized ws,
but found that it may contain some artifacts on shalini's example:
https://user-images.githubusercontent.com/46330265/225710735-7ca05898-b49b-41c3-bb67-c463c8eb5265.mp4
There are some visible artifacts:
I went through the issue posts in the original eg3d repo but didn't find any useful information. Do you have any takeaway conclusions from your experiments so far about not including camera information in the ws?
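For what it's worth, here is a minimal sketch of the two paths being discussed, based on the usage in EG3D's gen_samples.py as I understand it (`G`, `cam`, and `novel_cam` are placeholders, and `w_plus.pt` is a hypothetical file name for the optimized latent):

```python
import torch

device = torch.device('cuda')
# G: a loaded EG3D generator (e.g. from ffhq512-128.pkl); cam / novel_cam: (1, 25)
# camera-label tensors in the 16-extrinsic + 9-intrinsic layout discussed above.

# Path 1: normal sampling. The mapping network is conditioned on camera
# parameters, so the resulting ws bakes in a pose prior.
z = torch.randn(1, G.z_dim, device=device)
ws = G.mapping(z, cam, truncation_psi=0.7)
img = G.synthesis(ws, cam)['image']

# Path 2: what this projector does. The optimized ws is fed straight into the
# synthesis network, bypassing the mapping network and its camera conditioning,
# which is one plausible source of artifacts under novel cameras.
ws_opt = torch.load('w_plus.pt').to(device)  # hypothetical: latent saved by the projector
novel_img = G.synthesis(ws_opt, novel_cam)['image']
```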
Why does the regenerated image look different from the original image? Note that this image is not from the FFHQ dataset.
regenerated image:
source image:
What do you mean by "the regenerated image looks different from the original image"? If you mean that the regenerated image cannot capture some fine-level details of the original image, that is caused by the limited expressive power of the generative adversarial network. If you want to preserve the details, you can try https://github.com/jiaxinxie97/HFGI3D, which achieves better performance than my simple projector implementation.