OwalnutO comments


Error from batch_norm

I also got stuck on this problem and solved it another way. My TensorFlow version is '0.12.1'. I replaced the batch_norm class code in ops.py with the code from...

Can't download the caption dataset

Oh yeah, I have solved it. Thanks!

Hi, thanks for your wonderful code~ One problem: the image naming format differs between your work and PG2 (https://arxiv.org/pdf/1705.09368.pdf). You name the images as 'fashion+men/women+cloth+id+view.jpg', but...

What's the format of training/testing pairs and annotation files?

Has anyone figured out how to make these CSV files? Thanks~

How to use the autoencoder and LDM model?

> @forever208 thanks, I do. For my application I had to retrain though since I'm using more channels. Hi~ I also tried to train the AE on my own dataset but...

How to set the base_learning_rate in finetune autoencoderKL?

Set `--scale_lr=False`, then the learning rate will not be modified.
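
To illustrate why the flag matters, here is a minimal sketch. It assumes that, with scaling enabled, the training script multiplies `base_learning_rate` by the gradient-accumulation steps, the number of GPUs, and the batch size, as the latent-diffusion `main.py` appears to do. The function name `effective_learning_rate` and the numbers below are hypothetical and used only for illustration.

```python
def effective_learning_rate(base_lr, batch_size, n_gpus,
                            accumulate_grad_batches=1, scale_lr=True):
    """Return the learning rate the optimizer would actually receive.

    Assumption: when scale_lr is True, the script scales the base rate by
    accumulate_grad_batches * n_gpus * batch_size. When it is False, the
    base rate is passed through unchanged.
    """
    if scale_lr:
        return accumulate_grad_batches * n_gpus * batch_size * base_lr
    return base_lr


if __name__ == "__main__":
    base_lr = 4.5e-6  # hypothetical value, as might appear in a config file

    # With scaling, the rate grows with the total effective batch size.
    print(effective_learning_rate(base_lr, batch_size=12, n_gpus=8, scale_lr=True))

    # With --scale_lr=False, the configured base rate is used as-is.
    print(effective_learning_rate(base_lr, batch_size=12, n_gpus=8, scale_lr=False))
```

If the script in use computes the scaled rate differently, only the multiplication inside the `if` branch would need to change; the point is that disabling scaling leaves the configured value untouched.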

Question about the transform in data loader_imagenet_dct.py

Hah, same question. I guess this version is not complete. I hope the author can provide the full training code.

Image quality degradation with Flux kontext

Same problem. The inference pipeline may contain some crop and padding operations. I guess this is the main reason.

Summary of CogVideoX-5B-I2V-v1.5 inference and fine-tuning about `vae_scaling_factor_image` and vertical video

For the first question, it seems that, for the I2V model, the input image condition should not be multiplied by the scale. Therefore, during training, the video latent should be multiplied by the scale, but...

Finetuning CogVideoX-t2v-5B takes a very long time, even on 8xH100 GPUs

> When fine-tuning CogVideoX-5B on my own dataset, I've also encountered the same problem where the loss is noisy and doesn't go down. Have you discovered what the issue might...