CogView2
Hi, can you share your training code?
I want to train/finetune on my own dataset, but I didn't find the training code :)
Hi, pretrain_coglm.py is the pretraining code (first stage). The code for the second stage is a bit messy and not very useful for finetuning, so I didn't push it. The input format is a binary dataset generated by the cogdata IcetkImageTextTask.
Thanks, great job! I am interested in how you train the auto-encoder, but I did not find the loss function of the AE in pretrain_coglm.py; is there something I'm missing? I would really like to know how you use MS-SSIM and perceptual loss in this work. I think the weights of the different loss terms may matter in some way?
Hi, the tokenizer is not included in this project; it is in icetk. However, we did not release the training code for the tokenizer either. MS-SSIM and perceptual loss are used in the most straightforward way: add them to the reconstruction loss.
Thanks for your reply! But the perceptual loss can be extremely large (e.g. >1e5), while MS-SSIM lies in [0, 1]. Did you observe that MS-SSIM really helps when you directly sum them? I'm quite interested in this; if you didn't look into it, I will test it on my own :)
@DRJYYDS Hi, @minkowski0125 trained the tokenizer, he will provide more details.
Maybe carefully tuning the scales could work better, but MS-SSIM was not very important in our experiments. If you reach any conclusions in your tests, I would be grateful if you could share them with us.
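For reference, the "straightforward" combination discussed above might look like the sketch below. All weights here are hypothetical, chosen only to illustrate the scale mismatch raised in this thread (perceptual loss >1e5 vs. MS-SSIM in [0, 1]); the actual values used for the icetk tokenizer were not released.

```python
def total_recon_loss(l1_loss, ms_ssim, perceptual_loss,
                     w_ssim=1.0, w_perc=1e-5):
    """Weighted sum of reconstruction loss terms (illustrative only).

    l1_loss:         pixel-wise reconstruction loss (e.g. mean L1)
    ms_ssim:         MS-SSIM similarity in [0, 1]; higher is better,
                     so (1 - ms_ssim) is used as the loss term
    perceptual_loss: feature-space distance, which can be very large,
                     so w_perc rescales it toward the L1 term's magnitude
    """
    return l1_loss + w_ssim * (1.0 - ms_ssim) + w_perc * perceptual_loss


# Example with the magnitudes mentioned in the thread:
# an unscaled perceptual term of 2e5 is brought down to ~2.0,
# so it no longer drowns out the MS-SSIM contribution of 0.1.
loss = total_recon_loss(l1_loss=0.1, ms_ssim=0.9, perceptual_loss=2e5)
print(loss)  # 0.1 + 0.1 + 2.0 = 2.2
```

Without such rescaling, directly summing the raw terms would make the gradient almost entirely driven by the perceptual loss, which is one plausible reason MS-SSIM appeared unimportant in the experiments.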