CogView2

Hi, can you share your training code?

Open DRJYYDS opened this issue 2 years ago • 5 comments

I want to train/fine-tune on my own dataset, but I couldn't find the training code :)

DRJYYDS avatar Jun 22 '22 05:06 DRJYYDS

Hi, pretrain_coglm.py is the pretraining code (first stage). The code for the second stage is a bit messy and not very useful for fine-tuning, so I didn't push it. The input format is a binary dataset generated by cogdata's IcetkImageTextTask.

Sleepychord avatar Jun 22 '22 06:06 Sleepychord

Thanks, great job! I am interested in how you trained the auto-encoder, but I could not find the AE loss function in pretrain_coglm.py. Is there something I'm missing? I'd really like to know how you use MS-SSIM and perceptual loss in this work; I think the weights of the different loss terms may matter in some way.

DRJYYDS avatar Jun 22 '22 07:06 DRJYYDS

Hi, the tokenizer is not included in this project; it is in icetk. However, we did not release the training code for the tokenizer either. MS-SSIM and perceptual loss are used in the most straightforward way: they are added to the reconstruction loss.

Sleepychord avatar Jun 22 '22 07:06 Sleepychord
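
Editor's sketch (not the authors' code, which was not released): "adding them to the reconstruction loss" can be pictured as a weighted sum of scalar loss terms. The function name `total_loss`, the weights `w_ssim` and `w_perc`, and the use of `1 - MS-SSIM` as the loss term (since MS-SSIM is a similarity score where higher is better) are all assumptions made for illustration.

```python
def total_loss(recon, ms_ssim, perceptual, w_ssim=1.0, w_perc=1.0):
    """Combine autoencoder loss terms by weighted addition (hypothetical helper).

    recon      -- reconstruction loss value (e.g. an L1 or MSE scalar)
    ms_ssim    -- MS-SSIM similarity, expected in [0, 1]; higher is better
    perceptual -- perceptual loss value
    w_ssim, w_perc -- illustrative weights; 1.0 means a plain sum
    """
    if not 0.0 <= ms_ssim <= 1.0:
        raise ValueError("ms_ssim is expected to lie in [0, 1]")
    # Assumption: the similarity is turned into a loss as (1 - ms_ssim).
    return recon + w_ssim * (1.0 - ms_ssim) + w_perc * perceptual


# With the default weights this is a plain sum of the three terms.
print(total_loss(recon=0.5, ms_ssim=0.75, perceptual=2.0))
```

With all weights left at 1.0 the terms are simply summed, so any term with a much larger magnitude will dominate the total.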

Thanks for your reply! But the perceptual loss can be extremely large (e.g. >1e5), while MS-SSIM seems to lie in the range [0, 1]. Did you observe that MS-SSIM really has an effect when the terms are summed directly? I'm quite interested in this; if you haven't looked into it, I will test it on my own :)

DRJYYDS avatar Jun 22 '22 10:06 DRJYYDS

@DRJYYDS Hi, @minkowski0125 trained the tokenizer; he will provide more details.

Carefully tuning the scales might work better, but MS-SSIM was not very important in our experiments. If your tests lead to any conclusions, I would be grateful if you could share them with us.

Sleepychord avatar Jun 22 '22 13:06 Sleepychord
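
Editor's sketch of the scale question raised in this thread, under stated assumptions: the magnitudes below are hypothetical apart from the perceptual value of 1e5 and the [0, 1] range for MS-SSIM mentioned above, and the helpers `term_shares` and `balancing_weights` are illustrative names, not part of CogView2 or icetk. It shows how much each term contributes to an unweighted sum, and one simple way to pick weights that bring every term to a comparable magnitude.

```python
def term_shares(terms):
    """Return each term's fraction of the unweighted sum."""
    total = sum(terms.values())
    return {name: value / total for name, value in terms.items()}


def balancing_weights(typical, target=1.0):
    """Return weights that scale each term's typical magnitude to `target`."""
    if any(value <= 0 for value in typical.values()):
        raise ValueError("typical magnitudes must be positive")
    return {name: target / value for name, value in typical.items()}


# Hypothetical typical magnitudes observed early in training.
typical = {"recon": 0.1, "ms_ssim": 0.5, "perceptual": 1e5}

shares = term_shares(typical)
weights = balancing_weights(typical)

for name in typical:
    print(f"{name}: share={shares[name]:.6f}, weight={weights[name]:.2e}")
```

In this hypothetical setting the perceptual term accounts for almost the entire unweighted sum, which is one way to read the remark that MS-SSIM had little influence. Whether rebalanced weights actually improve reconstruction quality would need to be checked experimentally.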