vq-vae-2-pytorch

Conditioned Sample

Open Arstanley opened this issue 5 years ago • 11 comments

In the PixelSNAIL paper, the model can generate a conditioned sample given some global condition h. I am just wondering whether the current implementation can do that?

Arstanley avatar Jul 03 '19 21:07 Arstanley

Sorry for the late reply. Currently this implementation does not support class-conditional generation. Some modifications would be needed, such as injecting conditions into the top PixelSNAIL network.

rosinality avatar Jul 15 '19 06:07 rosinality

Hello, if I train on natural images, can I use your network to generate natural images (when running sample.py)? My second question: is it viable to add conditions to your network if I want to generate certain types of natural images? Thanks.

Dashi-1997 avatar Jul 02 '20 07:07 Dashi-1997

Yes, you can use it for that, as PixelSNAIL itself was used for natural images. You can also use conditions with it; in fact, the PixelSNAIL for the bottom code is conditioned on the top code. But if you want to use conditions like categories, other approaches may be more appropriate. (This implementation uses spatial feature grids of top codes as conditions.)
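As a rough illustration of what "spatial feature grids of top codes" can look like, here is a minimal sketch. It is not copied from the repository: the codebook size, channel width, grid sizes, and the single `Conv2d` embedding are assumptions chosen only for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

n_codes = 512      # assumed codebook size
n_channels = 64    # assumed width of the conditioning features

# Discrete top codes: one integer index per spatial position, shape (N, H, W).
top_codes = torch.randint(0, n_codes, (2, 32, 32))

# One-hot encode the indices and move the class axis to the channel position (NCHW).
one_hot = F.one_hot(top_codes, n_codes).permute(0, 3, 1, 2).float()

# Hypothetical embedding layer; any small conv stack could take its place.
embed = nn.Conv2d(n_codes, n_channels, kernel_size=3, padding=1)
cond = embed(one_hot)

# Nearest-neighbor upsampling (the default mode) so the grid matches
# a bottom code map that is twice as large in each direction.
cond = F.interpolate(cond, scale_factor=2)
print(cond.shape)  # torch.Size([2, 64, 64, 64])
```

The resulting tensor has one feature vector per bottom-code position, which is the kind of input a spatially conditioned prior can consume.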

rosinality avatar Jul 02 '20 12:07 rosinality

Thank you for your reply! My idea is to use a signal (4000 dimensions) as a condition. Can I modify your network to achieve this?

Dashi-1997 avatar Jul 02 '20 12:07 Dashi-1997

Yes. You can try several methods to incorporate it as a condition.

rosinality avatar Jul 02 '20 13:07 rosinality

Thanks! Do you mean that I can do the same thing at the top as is done at the bottom, and then pass the conditions to the top as vectors? If so, do I need to modify the loss function? My goal is to generate the corresponding image at test time just by giving a signal to the top layer as the condition. Is this kind of sampling feasible?

Dashi-1997 avatar Jul 02 '20 13:07 Dashi-1997

There are many options. You can use the conditioning mechanism in the current implementation. You only need to transform your signal vector into spatial (2D, NCHW) feature maps. (You can simply tile the tensor or use nearest-neighbor upsampling.) You will need to replace the lines at https://github.com/rosinality/vq-vae-2-pytorch/blob/master/pixelsnail.py#L416, which convert discrete codes to spatial feature maps, with your own code. I think this will work.
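A minimal sketch of the tiling idea, assuming a 4000-dimensional signal and a 32x32 code grid. `SignalToFeatureMap`, the linear projection, and the channel width are hypothetical and not part of the repository.

```python
import torch
import torch.nn as nn


class SignalToFeatureMap(nn.Module):
    """Project a 1D signal and tile it into an NCHW feature map (hypothetical helper)."""

    def __init__(self, signal_dim: int, n_channels: int):
        super().__init__()
        # Assumption: a linear layer reduces the signal to the channel width
        # the conditioning branch expects.
        self.proj = nn.Linear(signal_dim, n_channels)

    def forward(self, signal: torch.Tensor, height: int, width: int) -> torch.Tensor:
        feat = self.proj(signal)            # (N, C)
        feat = feat[:, :, None, None]       # (N, C, 1, 1)
        # Tile the same vector over every spatial position: (N, C, H, W).
        return feat.expand(-1, -1, height, width)


signal = torch.randn(2, 4000)
to_map = SignalToFeatureMap(signal_dim=4000, n_channels=64)
cond = to_map(signal, height=32, width=32)
print(cond.shape)  # torch.Size([2, 64, 32, 32])
```

Every spatial position carries the same vector here, so the condition is global; a map like this would stand in for the feature maps that are otherwise computed from discrete codes.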

You can also use the method of applying affine transforms on the channel dimension, which is widely used in GANs. You could refer to FiLM (https://arxiv.org/abs/1709.07871), SNGAN (https://arxiv.org/abs/1802.05637)/BigGAN (https://arxiv.org/abs/1809.11096), and StyleGAN (https://arxiv.org/abs/1812.04948). This will need more modifications, but it should not be very hard.
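A minimal sketch of a FiLM-style channel-wise affine transform, assuming the condition is a flat vector per sample. The `FiLM` module below is illustrative only; where to insert it in PixelSNAIL is left open. It uses the `(1 + scale)` variant with zero initialization, a common choice rather than the exact formulation of any of the papers above.

```python
import torch
import torch.nn as nn


class FiLM(nn.Module):
    """Channel-wise affine modulation of a feature map by a condition vector."""

    def __init__(self, cond_dim: int, n_channels: int):
        super().__init__()
        # One linear layer predicts a scale and a shift per channel.
        self.to_scale_shift = nn.Linear(cond_dim, 2 * n_channels)
        # Zero init (an assumption): the layer starts out as the identity.
        nn.init.zeros_(self.to_scale_shift.weight)
        nn.init.zeros_(self.to_scale_shift.bias)

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        scale, shift = self.to_scale_shift(cond).chunk(2, dim=1)  # (N, C) each
        scale = scale[:, :, None, None]  # broadcast over H and W
        shift = shift[:, :, None, None]
        return x * (1 + scale) + shift


x = torch.randn(2, 64, 32, 32)   # feature map inside the network
cond = torch.randn(2, 4000)      # condition vector, e.g. the 4000-dim signal
film = FiLM(cond_dim=4000, n_channels=64)
out = film(x, cond)
print(out.shape)  # torch.Size([2, 64, 32, 32])
```

Unlike tiling, this leaves the spatial layout of the features untouched and only rescales and shifts each channel.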

rosinality avatar Jul 02 '20 13:07 rosinality

Thanks a lot! I will follow your suggestion and try it.

Dashi-1997 avatar Jul 02 '20 14:07 Dashi-1997

If you wanted to generate an image from a text description, how would you condition the model? Would conditioning be needed only in the top layer of PixelSNAIL, or also in the VQ-VAE? Thank you.

inferense avatar Aug 10 '20 16:08 inferense

@maan198 I think conditioning the prior (PixelSNAIL) would be enough.

rosinality avatar Aug 11 '20 10:08 rosinality

Thanks for the details. Could you please elaborate on the conditioning mechanism in the current implementation? If I want to condition on a source image, can I use the same mechanism on the top layer of PixelSNAIL, that is, condition the sampling of the top code on the source image?

jungangc avatar Jul 12 '22 05:07 jungangc