rq-vae-transformer

The official implementation of Autoregressive Image Generation using Residual Quantization (CVPR '22)

Results 12 rq-vae-transformer issues

As training takes a very long time, would you mind uploading the training logs corresponding to [the released commands](https://github.com/kakaobrain/rq-vae-transformer#training-of-rq-vaes)?

There exists a very trivial typo when writing {args.split}_error_list.txt

Thanks for your work! Does this function only work during the training of the transformer at stage 2? https://github.com/kakaobrain/rq-vae-transformer/blob/2bf6ece4b85608cfae4c0e2969b17f75495e1639/rqvae/models/rqvae/quantizations.py#L372

First of all, thank you to all the authors for releasing this remarkable research and these models! I tried to fine-tune this RQ-Transformer model (3.9B) on a certain domain. (I'm already aware that it...

Hi, would you be interested in adding rq-vae-transformer to Hugging Face? The Hub offers free hosting, and it would make your work more accessible and visible to the rest of...

I was able to get the full-size parameter set working locally on my personal dev machine (16 samples on an RTX 3090), but I had to disable mixed precision and...

In my case, during T2I sampling, when I declare the following variables as int type like this:

```
top_k=1024
top_p=1
```

I got the following error...
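The error text is cut off, so its cause is not visible here. As a generic illustration of what `top_k` and `top_p` usually control, here is a minimal pure-Python sketch of top-k plus nucleus (top-p) filtering over a probability list. It is not the repository's sampling code. The function name `filter_top_k_top_p` and the `float(top_p)` coercion are assumptions made for this example.

```python
from typing import List


def filter_top_k_top_p(probs: List[float], top_k: int, top_p: float) -> List[float]:
    """Keep the most likely tokens (hypothetical helper, not the repo's API).

    At most `top_k` tokens are kept, and selection stops as soon as their
    cumulative probability reaches `top_p`. The kept probabilities are
    renormalized so they sum to 1; all other entries become 0.
    """
    top_p = float(top_p)  # assumption: accept an int such as 1 as well as 1.0
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    if top_k < 1:
        raise ValueError("top_k must be >= 1")

    # Token indices sorted by descending probability.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    keep = set()
    cumulative = 0.0
    for rank, idx in enumerate(order):
        if rank >= top_k:
            break  # top-k limit reached
        keep.add(idx)
        cumulative += probs[idx]
        if cumulative >= top_p:
            break  # nucleus mass reached

    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]


if __name__ == "__main__":
    probs = [0.5, 0.3, 0.15, 0.05]
    # With top_k larger than the vocabulary and top_p equal to 1,
    # no token is filtered out.
    print(filter_top_k_top_p(probs, top_k=1024, top_p=1))
    # A tighter top_k keeps only the two most likely tokens.
    print(filter_top_k_top_p(probs, top_k=2, top_p=1.0))
```

In this sketch, `top_k=1024` with `top_p=1` is effectively a no-op for a small vocabulary, since neither limit removes any token.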

rqvae/models/rqtransformer/primitives.py

    class LogitMask(nn.Module):
        def __init__(self, vocab_size: Iterable[int], value=-1e6):
            super().__init__()
            self.vocab_size = vocab_size
            self.mask_cond = [vocab_size[0]]*len(vocab_size) != vocab_size
            self.value = value

        def forward(self, logits: Tensor) -> Tensor:
            if not self.mask_cond:
                return...
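To make the `mask_cond` expression in the quoted `__init__` easier to reason about, the sketch below evaluates the same expression on a few inputs in plain Python, without PyTorch. The standalone function `mask_cond` and the sample vocabulary sizes are assumptions for illustration. Whether this behavior is what the issue reports is not visible in the truncated text.

```python
from typing import Sequence


def mask_cond(vocab_size: Sequence[int]) -> bool:
    # Same expression as in the quoted __init__: build a list that repeats
    # the first entry, then compare it with the original argument.
    return [vocab_size[0]] * len(vocab_size) != vocab_size


if __name__ == "__main__":
    # A list whose entries are all equal: the two lists match.
    print(mask_cond([1024, 1024, 1024, 1024]))  # False
    # A list with differing entries: the two lists differ.
    print(mask_cond([1024, 512, 512, 512]))     # True
    # A tuple whose entries are all equal: in Python a list never compares
    # equal to a tuple, so the inequality is True here as well.
    print(mask_cond((1024, 1024, 1024, 1024)))  # True
```

The third case is worth noting: the result depends on the container type passed in, not only on the values it holds.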

I tried to retrain RQ-VAE on FFHQ with the default ffhq256-rqvae-8x8x4.yaml. The training loss first decreases and then increases. ![image](https://user-images.githubusercontent.com/32598987/223308206-0dc2cab7-9583-48c2-a229-16041b544f6e.png) Then I computed rFID using the model with the...

I want to get reconstructed images on my own dataset, but I can only find the code to compute rFID. Which code can I use to reconstruct images with the pretrained model?