taming-transformers
Taming Transformers for High-Resolution Image Synthesis
In file `taming/models/cond_transformer`, in function `encode_to_c`, the permuter is not applied, while it is applied for z. Is this intentional? The default permuter is Identity, so I guess it doesn't affect much...
Hi! I love this project. It looks really promising, I would like to train it on a dataset of spectrograms in order to generate audio instead of images. Anyone tried...
Has the wheel been updated on PyPI? I'm not seeing a version number bump yet and I don't believe the `GumbelVQ` class we're trying to access is available from the...
I was trying to reproduce the label-conditioned model on ImageNet and ran into the error in the title. I think this line in the code, https://github.com/CompVis/taming-transformers/blob/9d17ea64b820f7633ea6b8823e1f78729447cb57/taming/models/cond_transformer.py#L242 should be replaced as:...
I'm trying to figure out how to train a conditional model, but am a little stumped as to the correct way to set up the config. My data structure looks like this:...
Hi, Thanks for the awesome repo! I had a couple of questions about the implementation of the loss function. In the paper you multiply the entire GAN loss by the...
I am trying to train VQ-GAN on the COCO dataset, but I got reconstructed images with grid patterns on them, and the reconstructed image doesn't look like the input. Is...
I was wondering if you could explain why the last token in the sequence is dropped in the cond_transformer.py script. The paper does not give an explanation...
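For context on the question above, here is a minimal sketch of how next-token training usually aligns inputs and targets, which may be why the last token is dropped. It assumes the conditioning and image tokens are concatenated into one sequence; the names `c_indices`, `z_indices`, and `build_training_pair` are illustrative and the token ids are made up, so this is not the repository's code.

```python
# Hypothetical token ids, chosen only for illustration.
c_indices = [7]                # conditioning tokens (e.g. a class label)
z_indices = [11, 12, 13, 14]   # image tokens the model should predict


def build_training_pair(c_indices, z_indices):
    """Return (model_input, targets, offset) for next-token prediction."""
    cz = c_indices + z_indices
    # The final token has no successor to predict, so it is not fed in.
    model_input = cz[:-1]
    # Only the image tokens are scored by the loss.
    targets = z_indices
    # First output position whose prediction is an image token.
    offset = len(c_indices) - 1
    return model_input, targets, offset


model_input, targets, offset = build_training_pair(c_indices, z_indices)

# Output position i predicts token i + 1 of the concatenated sequence.
predicted_positions = list(range(offset, len(model_input)))
aligned = [(model_input[i], targets[i - offset]) for i in predicted_positions]

for seen, expected in aligned:
    print(f"after seeing {seen}, predict {expected}")
```

Under these assumptions, dropping the last token makes the number of scored output positions equal the number of image tokens, so each target has exactly one prediction.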
Hi. I have seen this sentence in doc-string of class `VectorQuantizer2` ([link](https://github.com/CompVis/taming-transformers/blob/master/taming/modules/vqvae/quantize.py#L215)): > Mostly avoids costly matrix multiplications and allows for post-hoc remapping of indices. But I do not quite...
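One possible reading of the quoted docstring, offered as an assumption rather than a confirmed answer: selecting codebook vectors by multiplying a one-hot matrix with the codebook gives the same result as indexing the codebook directly, and the indexing form needs no matrix product. The sketch below uses plain Python lists and a made-up codebook; the function names are hypothetical.

```python
# Hypothetical codebook: 3 code vectors of dimension 2.
codebook = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
indices = [2, 0]  # nearest-code indices for two latent vectors


def lookup_via_one_hot(indices, codebook):
    """Select code vectors with an explicit one-hot matrix product."""
    n, d = len(codebook), len(codebook[0])
    out = []
    for idx in indices:
        one_hot = [1.0 if j == idx else 0.0 for j in range(n)]
        row = [sum(one_hot[j] * codebook[j][k] for j in range(n)) for k in range(d)]
        out.append(row)
    return out


def lookup_via_index(indices, codebook):
    """Select the same code vectors by direct indexing."""
    return [list(codebook[idx]) for idx in indices]


same = lookup_via_one_hot(indices, codebook) == lookup_via_index(indices, codebook)
print(same)
```

Direct indexing also keeps the integer indices available, which is what makes it straightforward to remap them afterwards.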
Hi, I'm new to this field, so as part of my studies I'm trying to detect and return a Sudoku grid from an image. I know I can use the...