
Manipulation using W+ latents

Open yaseryacoob opened this issue 4 years ago • 4 comments

The work is very interesting. The discussion at the end of the paper regarding inversion (and the associated Figures 18+) is intriguing. I have two questions:

  1. Can you share the code for inversion for S space?
  2. I would like to test the Style channels on precomputed W+ latents as an input (like Figure 19). I am not sure where exactly to inject that in your code.

thanks

yaseryacoob, Apr 06 '21 16:04

Dear Yaseryacoob,

Thank you for your interest in our work.

  1. The e4e encoder does a better job than our encoder in terms of reconstruction. Although it uses W+ space rather than S space, it tries to keep the W+ vector close to a W vector and achieves good manipulability. I suggest trying their encoder.

  2. To manipulate custom images (W+), please open our single-channel Colab, upload the W+ NumPy array, and add the following code right after the 'Select dataset' section.

w_plus=np.load('w_plus.pt')  # the uploaded W+ NumPy array
M.dlatents=M.W2S(w_plus)

If you want to manipulate real images, please upload the output 'latents.pt' from the e4e encoder. Then add the following code right after the 'Select dataset' section.

import torch
latents=torch.load('latents.pt')
w_plus=latents.cpu().detach().numpy()  # convert the e4e tensor to a NumPy array
M.dlatents=M.W2S(w_plus)
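
Before calling M.W2S, it may help to confirm that the W+ array has an explicit batch axis. The following is only a minimal sketch: the helper name as_w_plus_batch is hypothetical (it is not part of the StyleSpace code), and the defaults of 18 layers and 512 dimensions are assumptions that match the 1,18,512 latent reported in this thread, not a documented requirement of W2S.

```python
import numpy as np

def as_w_plus_batch(w_plus, num_layers=18, dim=512):
    """Return w_plus with an explicit batch axis: (N, num_layers, dim).

    Hypothetical helper; num_layers and dim are assumed defaults.
    """
    w_plus = np.asarray(w_plus)
    if w_plus.ndim == 2:
        # A single latent (num_layers, dim) becomes a batch of one.
        w_plus = w_plus[np.newaxis, ...]
    if w_plus.ndim != 3 or w_plus.shape[1:] != (num_layers, dim):
        raise ValueError(
            f"expected shape (N, {num_layers}, {dim}), got {w_plus.shape}"
        )
    return w_plus

# Stand-in for a real inverted latent; replace with the loaded array.
single = np.zeros((18, 512), dtype=np.float32)
print(as_w_plus_batch(single).shape)  # (1, 18, 512)
```

The returned array could then be passed on in place of w_plus in the snippet above.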

Best Wishes,

Alex

betterze, Apr 07 '21 14:04

I am missing something. I tried an inverted latent, 18x512 (attached as latent.pt.txt), and the odd thing is that the size of M.dlatents[0] is 1,512, whereas it is 2000,512 if I use your original code in the Colab.

Here is the code, as you suggested:

M=Manipulator(dataset_name=dataset_name)
if 1:
    map_location=torch.device('cpu')
    latents=torch.load('latent.pt', map_location=map_location)
    w_plus=latents.cpu().detach().numpy()
    M.dlatents=M.W2S(w_plus)

I must be missing something; the .pt is 1,18,512.

yaseryacoob, Apr 07 '21 21:04

Dear Yaseryacoob,

M.dlatents[0] is the first layer of the S vector, which has length 512; the 2000 is the number of examples. For details on the relation between S and W+ space, please refer to Figure 9 and Table 2 in our supplementary material.

Since the latent.pt you provided contains only 1 image, please change M.img_index in the next cell, 'choose attribute', to 0.

I tried the latent.pt you uploaded; just changing M.img_index to 0 makes everything work. The image is a man with short hair.
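
The relation between the number of examples and M.img_index can be checked with a few lines. This is a rough sketch only: count_examples and check_img_index are hypothetical helpers, and the code assumes M.dlatents is a list of per-layer arrays whose first axis is the number of examples, as the 1,512 and 2000,512 shapes above suggest. The stand-in arrays are placeholders, not real S vectors.

```python
import numpy as np

def count_examples(dlatents):
    """Number of examples shared by every per-layer array in dlatents."""
    counts = {np.asarray(layer).shape[0] for layer in dlatents}
    if len(counts) != 1:
        raise ValueError(
            f"layers disagree on the number of examples: {sorted(counts)}"
        )
    return counts.pop()

def check_img_index(dlatents, img_index):
    """Return img_index if it addresses an existing example, else raise."""
    n = count_examples(dlatents)
    if not 0 <= img_index < n:
        raise IndexError(
            f"img_index {img_index} is out of range for {n} example(s)"
        )
    return img_index

# Stand-in for M.dlatents after loading one image (assumed layout).
dlatents = [np.zeros((1, 512)), np.zeros((1, 512))]
print(count_examples(dlatents))      # 1
print(check_img_index(dlatents, 0))  # 0
```

With a single uploaded latent, any index other than 0 would fail this check, which matches the behavior described above.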

Best Wishes,

Alex

betterze, Apr 07 '21 22:04

Dear Alex, this solved it; it didn't occur to me until you mentioned it. Thank you again for sharing the code and for the effort you put into it. So far, after a couple of examples, I think your S space is better than any of the ones I have seen in the last few months. I will experiment with it some more tomorrow to see how it performs on different faces. Some faces are more entangled than others (in latent space) for unknown reasons. I am curious about the following:

  1. Can one apply multiple attributes together (like wavy hair and grey hair or even 3-4 attributes)?
  2. Is there a way to mix your latents with other latents? Essentially, different researchers are uncovering different attribute pathways, and it would be good to have a process to blend them. Imagine 0.3A+0.2B+4C+0.1D, where the letters are derived from different principal directions. On one hand, everybody is using StyleGAN2, but on the other hand, it is not clear how entangled things are if one is not very careful.
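
To make the arithmetic in question 2 concrete, here is a toy sketch of the weighted sum 0.3A+0.2B+4C+0.1D. It assumes all directions are vectors in the same latent space with the same length, which is exactly the open point raised above; blend_directions is a hypothetical helper, and the 4-dimensional unit vectors are placeholders rather than real attribute directions.

```python
import numpy as np

def blend_directions(directions, weights):
    """Weighted sum of K direction vectors that share one latent space."""
    directions = np.asarray(directions, dtype=np.float64)  # shape (K, dim)
    weights = np.asarray(weights, dtype=np.float64)        # shape (K,)
    if directions.ndim != 2 or weights.shape != (directions.shape[0],):
        raise ValueError("need K directions of equal length and K weights")
    return weights @ directions  # shape (dim,)

# Toy directions: unit vectors standing in for A, B, C, D.
A, B, C, D = np.eye(4)
step = blend_directions([A, B, C, D], [0.3, 0.2, 4.0, 0.1])
print(step)  # [0.3 0.2 4.  0.1]
```

The sketch only shows the linear combination itself; it says nothing about whether directions found by different methods are compatible or disentangled.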

There are any number of open challenges in this space. I will be glad to brainstorm if you send me an email at [email protected]

Thanks again!!

yaseryacoob, Apr 07 '21 23:04