Hi Syed, 1. **Evaluation (Table 5):** we used the same text prompt for evaluation since there is ground truth to evaluate the different text prompts. 2. **Generation (Fig. 7):** we generate...
Hi, You can ignore that part. I was just experimenting with separate codebooks for the upper and lower body, but I ended up not using it in the paper. Everything under [is_upperlower](https://github.com/exitudio/BAMM/blob/e6910b3c3d38b2e1ee131f96a74c7714a8419219/models/vq/model.py#L38)...
Hi, You can find the training logs for the 1st and 2nd stages, as well as the evaluation, in the output folder if you have already downloaded them from [2.3. Pre-trained models]. Alternatively, you...
Hi, We don't support other datasets, but you can modify the code here: [1](https://github.com/exitudio/MMM/blob/main/dataset/dataset_VQ.py#L17) [2](https://github.com/exitudio/MMM/blob/main/train_vq.py#L48) [3](https://github.com/exitudio/MMM/blob/main/train_t2m_trans.py#L56C69-L56C78).
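For reference, here is a minimal sketch of the kind of switch those linked lines implement. Everything below is illustrative, not the repo's exact code: the directory paths are assumptions, and the `mydata` branch (with its joint count and pose-feature dimension) is a hypothetical placeholder for your own data; the `t2m`/`kit` values follow HumanML3D/KIT-ML.

```python
# Illustrative sketch of a dataset switch in the spirit of the linked lines.
def get_dataset_config(dataname: str):
    """Return (data_root, nb_joints, dim_pose) for a dataset name."""
    if dataname == 't2m':        # HumanML3D: 22 joints, 263-dim pose features
        return './dataset/HumanML3D', 22, 263
    if dataname == 'kit':        # KIT-ML: 21 joints, 251-dim pose features
        return './dataset/KIT-ML', 21, 251
    if dataname == 'mydata':     # placeholder: point to your data and its feature dims
        return './dataset/MyData', 24, 300
    raise ValueError(f'Unsupported dataset: {dataname}')
```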
Hi, Here is the log before the code cleanup. I am re-training to verify again.
```
2023-10-12 10:11:15,494 INFO {
    "batch_size": 512,
    "block_size": 51,
    "clip_dim": 512,
    "code_dim": 32,
    "dataname": "t2m",
    ...
```
1. No. It's saved from the last epoch. 2. If you intend to train the Transformer, load only the pretrained vqvae.pth model. Perhaps it has been overtrained.
Can you try training longer? It seems like you're using a batch size of 128 with 30,000 epochs. I use a batch size of 512 with 75,000 epochs. I also...
Hi, We reset the codebook during the quantization process [here](https://github.com/exitudio/MMM/blob/main/models/quantize_cnn.py#L69-L71).
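The idea behind those lines can be sketched roughly as follows (a simplified sketch, not the repo's exact code: codes that receive no encoder features in a batch are re-initialized from randomly sampled encoder outputs; the function and variable names here are illustrative).

```python
import torch

def reset_dead_codes(codebook, code_count, encoder_feats):
    """Sketch of codebook reset: replace unused codes with random encoder features.

    codebook:      (nb_code, code_dim) current codebook
    code_count:    (nb_code,) how often each code was selected in this batch
    encoder_feats: (N, code_dim) flattened encoder outputs from the same batch
    """
    nb_code = codebook.size(0)
    # Randomly pick one candidate encoder feature per codebook entry.
    idx = torch.randint(0, encoder_feats.size(0), (nb_code,), device=encoder_feats.device)
    code_rand = encoder_feats[idx]
    # Keep codes that were used at least once; re-initialize the dead ones.
    usage = (code_count >= 1.0).float().unsqueeze(1)     # (nb_code, 1)
    return usage * codebook + (1.0 - usage) * code_rand
```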
Hi, we use [MASK] tokens for generation via iterative decoding and [PAD] tokens to fill up the shorter-length samples. [PAD] tokens in the CLIP model can be used in a similar manner....
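To make the roles of the two tokens concrete, here is a rough sketch of confidence-based iterative decoding over a [MASK]/[PAD] canvas. This is not the repo's exact code: `model` is assumed to map token ids to per-position logits, and the cosine schedule is the usual MaskGIT-style choice.

```python
import math
import torch

def iterative_decode(model, lengths, max_len, mask_id, pad_id, steps=10):
    """Positions beyond each sample's length stay [PAD]; the rest start as
    [MASK] and are filled with the most confident predictions over `steps`."""
    B = lengths.size(0)
    tokens = torch.full((B, max_len), pad_id, dtype=torch.long)
    valid = torch.arange(max_len).unsqueeze(0) < lengths.unsqueeze(1)   # (B, max_len)
    tokens[valid] = mask_id                            # only these positions are generated

    for t in range(steps):
        logits = model(tokens)                         # assumed shape: (B, max_len, vocab)
        conf, pred = logits.softmax(-1).max(-1)        # per-position confidence / argmax
        still_masked = tokens == mask_id
        conf = conf.masked_fill(~still_masked, -1.0)   # never re-fill decided or [PAD] slots

        keep_ratio = math.cos(math.pi / 2 * (t + 1) / steps)   # fraction left masked
        for b in range(B):
            n_keep = int(int(lengths[b]) * keep_ratio)
            n_fill = int(still_masked[b].sum()) - n_keep
            if n_fill > 0:
                top = conf[b].topk(n_fill).indices     # most confident masked positions
                tokens[b, top] = pred[b, top]
    return tokens
```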