Some questions

Open taluos opened this issue 1 year ago • 7 comments

Dear Author,

Thank you very much for your work. I have a few questions I would like to ask: Have you ever used human face datasets (such as FFHQ) for training? If so, how were the results? If not, can I directly use ISSA for related training?

taluos avatar Aug 09 '24 09:08 taluos

Hi @taluos, thanks for your interest. At the beginning of the project we tried face datasets, and ISSA also gave better reconstruction quality there. As far as I remember, though, the hyperparameters, e.g., mask size and random masking ratio, were set differently.

YumengLi007 avatar Aug 09 '24 13:08 YumengLi007

Hello! Thank you very much for your response. In addition, I noticed that in your paper, you mentioned training the StyleGAN to synthesize images before training the encoder. I’m wondering if I can skip this step if I use a pre-trained StyleGAN model? Also, I’m not very clear about the function of data_fake in the configuration file. It seems to contain images generated by StyleGAN, but I’m unsure of its role in the overall training process.

taluos avatar Aug 10 '24 08:08 taluos

Hi @taluos ,

  1. Yes, you can directly use a pretrained StyleGAN model. There just wasn't one trained on Cityscapes, so I trained one myself.
  2. Right, it contains images generated by StyleGAN together with their style latents w. They are used as regularization so that the inverted codes stay close to the original latent space. We also observed that this speeds up training convergence. Please see Eq. (8) in the paper.

YumengLi007 avatar Aug 10 '24 09:08 YumengLi007
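
For readers without the paper at hand, a regularizer of this kind is often written as a latent reconstruction term on synthetic pairs. The form below is a hedged sketch, not a transcription of Eq. (8): the symbols E (encoder) and G (generator) and the squared L2 distance are assumptions, and the paper's version may include additional terms.

```latex
\mathcal{L}_{\mathrm{reg}} = \mathbb{E}_{w}\left[ \lVert E(G(w)) - w \rVert_2^2 \right]
```

Under this reading, each w is the style latent stored alongside its generated image in data_fake, so the encoder is penalized when its prediction for a synthetic image drifts from the latent that produced it.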

Thanks again @YumengLi007, I see. So before training the encoder I generate some images, and during training I use the encoder to obtain their latent codes. In that case, how many images do I need to generate for training?

I tried training, but I encountered the following error:

Traceback (most recent call last):
  File "train_encoder.py", line 493, in <module>
    main(rank=0)
  File "train_encoder.py", line 211, in main
    resume_data = misc.load_network_pkl(f)
AttributeError: module 'torch_utils.misc' has no attribute 'load_network_pkl'

I obtained torch_utils from https://github.com/NVlabs/stylegan3, but it doesn't include the load_network_pkl method. I only found that this method exists in a similar file from https://github.com/NVlabs/stylegan. Could you provide your torch_utils file, or could you advise me on how to modify it?

taluos avatar Aug 10 '24 13:08 taluos

Hi @taluos

  1. 50k images should be enough.
  2. You might find this issue helpful :) https://github.com/boschresearch/ISSA/issues/7

YumengLi007 avatar Aug 10 '24 13:08 YumengLi007

I encountered the following error during the training process:

-- Process 1 terminated with the following error:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "/opt/data/private/ISSA/ISSA-main/ISSA-main/train_encoder.py", line 279, in main
    E=encoder,D=D_enc, G=generator, percept=percept,
UnboundLocalError: local variable 'D_enc' referenced before assignment

It seems to be caused by training with two GPUs: the process with rank = 1 never defines this variable. How can I resolve this issue?

taluos avatar Aug 11 '24 06:08 taluos

I simply modified the following code so that this part is executed on both rank = 0 and 1, which solved the issue, but I'm not sure if this is the correct approach.

if rank == 0:
    print('Setting Discriminator...')
# Build the discriminator on every rank, not only on rank 0.
D_channel = training_set.num_channels
common_kwargs = dict(input_nc=D_channel, getIntermFeat=True)
D_enc = create_class_by_name(**config.enc_D_kwargs, **common_kwargs)  # subclass of torch.nn.Module
D_enc = D_enc.train().requires_grad_(False).to(device)

taluos avatar Aug 11 '24 06:08 taluos
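
The failure mode above can be reproduced without ISSA or any GPU. The sketch below assumes the original assignment of D_enc sat inside the rank == 0 branch, as the traceback suggests; build_broken, build_fixed, and rank_fails are hypothetical names used only for illustration, and a string stands in for the real discriminator module.

```python
def build_broken(rank):
    # Mirrors the reported pattern: the assignment lives inside the rank-0 branch.
    if rank == 0:
        print('Setting Discriminator...')
        d_enc = 'discriminator'  # stand-in for the real torch.nn.Module
    return d_enc  # unbound on every rank other than 0


def build_fixed(rank):
    # Only the log message is rank-specific; every rank creates its own object.
    if rank == 0:
        print('Setting Discriminator...')
    d_enc = 'discriminator'
    return d_enc


def rank_fails(builder, rank):
    # Returns True if the builder raises UnboundLocalError for this rank.
    try:
        builder(rank)
    except UnboundLocalError:
        return True
    return False


if __name__ == '__main__':
    for rank in (0, 1):
        print(rank, rank_fails(build_broken, rank), rank_fails(build_fixed, rank))
```

In multi-GPU training each spawned process runs main with its own rank, so any object used later by all ranks has to be created outside rank-guarded branches; only logging and similar side effects need the guard.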