
Different dataset

Open • povolann opened this issue 1 year ago • 5 comments

Hello, thank you so much for your code! I am using it on the Shapenet dataset (and on some other datasets as well), and I have managed to get reasonable results with the W and W+ projectors. Now I am trying to generate reasonable results from different angles (new scenes) with the last two scripts, run_pti_single_image.py and gen_videos_from_given_latent_code.py. Since I am interested in different angle ranges than the ones hard-coded for the FFHQ dataset (I want a full 360° azimuth sweep and a slight elevation rotation around the car), do I also have to change run_pti_single_image.py? I have already changed paths_config.py, but I don't see any variables linked to the geometry... Thank you for your answer!

So far, my results (images attached): the input image (shapenet01) and the W projector inversion image after 1000 steps (1000).

povolann • Feb 05 '24 11:02

Hi @povolann!

If you've already achieved reasonable projection results, you should be able to modify gen_videos_from_given_latent_code.py to render a full 360° azimuth sweep. You can do so as follows:

In https://github.com/oneThousand1000/EG3D-projector/blob/a9c61920b5f127a2c9eba68cab7b7fc7bacf9274/eg3d/gen_videos_from_given_latent_code.py#L130

from

cam2world_pose = LookAtPoseSampler.sample(
                    3.14 / 2 + yaw_range * np.sin(2 * 3.14 * frame_idx / (num_keyframes * w_frames)),
                    3.14 / 2 - 0.05 + pitch_range * np.cos(2 * 3.14 * frame_idx / (num_keyframes * w_frames)),
                    camera_lookat_point, radius=2.7, device=device)

to

cam2world_pose = LookAtPoseSampler.sample(
                               np.pi / 2 + (frame_idx / w_frames) * 2 * np.pi,
                               np.pi / 2,
                               camera_lookat_point, radius=2.7, device=device)

Notice that in this modification, I have fixed the vertical angle at 90°, so the camera stays level with the look-at point and there is no elevation change. To implement elevation rotation, you can adjust the vertical_mean argument within LookAtPoseSampler.sample, and for azimuth adjustment, modify the horizontal_mean argument.
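
For example, a minimal sketch that keeps the full 360° azimuth sweep and adds a small elevation oscillation could look like this (it reuses the same variables as the snippet above; elev_range is just an assumed amplitude in radians to tune):

import numpy as np

elev_range = 0.15  # assumed elevation amplitude (radians), tune per dataset
cam2world_pose = LookAtPoseSampler.sample(
    np.pi / 2 + (frame_idx / w_frames) * 2 * np.pi,                     # horizontal_mean: one full azimuth turn
    np.pi / 2 + elev_range * np.sin(2 * np.pi * frame_idx / w_frames),  # vertical_mean: slight up/down wobble
    camera_lookat_point, radius=2.7, device=device)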

BTW, all of these changes were tested on the FFHQ dataset. The camera system of Shapenet might differ, necessitating certain adjustments, so be prepared to make minor modifications as needed.

oneThousand1000 • Feb 05 '24 11:02

So, I have managed to get pretty good results with the pre-trained Shapenet model. But the problem comes when I use my own datasets, which use a different function, get_render_pose(), for sampling the poses:

if cfg == 'FFHQ' or cfg == 'Shapenet':
    pitch_range = 0.25
    yaw_range = 0.35
    cam2world_pose = LookAtPoseSampler.sample(
        3.14 / 2 + yaw_range * np.sin(2 * 3.14 * frame_idx / (num_keyframes * w_frames)),
        3.14 / 2 - 0.05 + pitch_range * np.cos(2 * 3.14 * frame_idx / (num_keyframes * w_frames)),
        camera_lookat_point, radius=G.rendering_kwargs['avg_camera_radius'], device=device)  # torch.Size([1, 4, 4]), cuda
elif cfg == 'drr' or cfg == 'Carla':
    cam2world_pose = get_render_pose(radius=10.5, phi=phi + 360 / (num_keyframes * w_frames) * frame_idx, theta=45).unsqueeze(0).to(device)
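
(For anyone reading along without my dataset code: get_render_pose() is my own helper, not part of this repo. A hypothetical version could look roughly like the sketch below, which builds a 4x4 cam2world matrix from spherical coordinates with phi/theta in degrees; the conventions here are assumed, and this is only an illustration, not my actual function.)

import numpy as np
import torch

def get_render_pose(radius, phi, theta, lookat=(0.0, 0.0, 0.0)):
    # Hypothetical sketch: camera on a sphere of the given radius, looking at `lookat`.
    # phi = azimuth in degrees, theta = polar angle in degrees (assumed conventions).
    phi, theta = np.deg2rad(phi), np.deg2rad(theta)
    cam_pos = np.array([radius * np.sin(theta) * np.cos(phi),
                        radius * np.sin(theta) * np.sin(phi),
                        radius * np.cos(theta)])
    forward = np.asarray(lookat, dtype=np.float64) - cam_pos
    forward /= np.linalg.norm(forward)
    world_up = np.array([0.0, 0.0, 1.0])           # degenerate if theta is 0 or 180
    right = np.cross(forward, world_up)
    right /= np.linalg.norm(right)
    down = np.cross(forward, right)                # x-right, y-down, z-forward frame
    cam2world = np.eye(4)
    cam2world[:3, 0], cam2world[:3, 1], cam2world[:3, 2], cam2world[:3, 3] = right, down, forward, cam_pos
    return torch.from_numpy(cam2world).float()     # [4, 4]; the caller unsqueezes the batch dim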

I have trained the models with the get_render_pose setup above, so when I run convert_pkl_2_pth.py I get a nice result for the pkl, but for the pth the result looks super weird, so I guess there is some problem with the model conversion. Do you have any idea what it might be?

https://github.com/oneThousand1000/EG3D-projector/assets/31487206/ead15b23-9808-4e84-8661-dcbc612a8fe6

Actually, I have uploaded the code as a Jupyter Notebook to Google Colab. Any help is greatly appreciated :)

povolann • Apr 15 '24 10:04

Hi,

Sorry, I think the weird results may be caused by my clumsy init_kwargs and rendering_kwargs settings in:

https://github.com/oneThousand1000/EG3D-projector/blob/a9c61920b5f127a2c9eba68cab7b7fc7bacf9274/eg3d/convert_pkl_2_pth.py#L313
https://github.com/oneThousand1000/EG3D-projector/blob/a9c61920b5f127a2c9eba68cab7b7fc7bacf9274/eg3d/convert_pkl_2_pth.py#L326
https://github.com/oneThousand1000/EG3D-projector/blob/a9c61920b5f127a2c9eba68cab7b7fc7bacf9274/eg3d/gen_videos_from_given_latent_code.py#L298
https://github.com/oneThousand1000/EG3D-projector/blob/a9c61920b5f127a2c9eba68cab7b7fc7bacf9274/eg3d/gen_videos_from_given_latent_code.py#L312

I hardcoded those settings, and they are only suitable for the FFHQ model.

To load the Shapenet model parameters and configs, you can replace those redundant lines with the more flexible version used here: https://github.com/NVlabs/eg3d/blob/7cf1fd1e99e1061e8b6ba850f91c94fe56e7afe4/eg3d/gen_samples.py#L146
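
Roughly, that reload path works like the sketch below (paraphrased from the linked gen_samples.py, so please double-check against the actual file; network_pkl is the path to your checkpoint):

import torch
import dnnlib
import legacy
from torch_utils import misc
from training.triplane import TriPlaneGenerator

device = torch.device('cuda')
with dnnlib.util.open_url(network_pkl) as f:
    G = legacy.load_network_pkl(f)['G_ema'].to(device)

# Rebuild a fresh generator from the init args/kwargs stored in the pickle,
# then copy the weights over, so nothing is hardcoded for FFHQ.
G_new = TriPlaneGenerator(*G.init_args, **G.init_kwargs).eval().requires_grad_(False).to(device)
misc.copy_params_and_buffers(G, G_new, require_all=True)
G_new.neural_rendering_resolution = G.neural_rendering_resolution
G_new.rendering_kwargs = G.rendering_kwargs
G = G_new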

oneThousand1000 • Apr 15 '24 12:04

I rewrote this part previously too, but actually I think the problem is in saving and loading the .pth model via state_dict():

save_dict = {'G_ema': G.state_dict()}  # only the weights are saved ...
...
G_new.load_state_dict(ckpt['G_ema'], strict=False)  # ... and strict=False can silently skip mismatched keys

So the loading part of the code now looks like this:

    print('Loading networks from "%s"...' % network_pkl)
    device = torch.device('cuda')
    with dnnlib.util.open_url(network_pkl) as f:
        G = legacy.load_network_pkl(f)['G_ema'].to(device)  # type: ignore

    G.eval()
    # Increase the ray-marching sampling density by sampling_multiplier.
    G.rendering_kwargs['depth_resolution'] = int(G.rendering_kwargs['depth_resolution'] * sampling_multiplier)
    G.rendering_kwargs['depth_resolution_importance'] = int(
        G.rendering_kwargs['depth_resolution_importance'] * sampling_multiplier)

    network_pth = network_pkl.replace('pkl', 'pth')
    print('Save pth to', network_pth)
    # Save the whole generator module (architecture, configs, and weights),
    # not just a state_dict, so nothing has to be hardcoded when reloading.
    torch.save(G, network_pth)

    print("Reloading Modules!")
    G_new = torch.load(network_pth)
    G_new.eval()

    if nrr is not None: G.neural_rendering_resolution = nrr

And it works. (I have tested it even for other datasets with very different geometry.)

https://github.com/oneThousand1000/EG3D-projector/assets/31487206/8b8fe96c-31e6-40c1-87f5-41caab9757b0

povolann • Apr 17 '24 08:04

Looks great!

I will pin this issue for anyone who wants to use the Shapenet model.

oneThousand1000 • Apr 17 '24 08:04