Dian
Thanks for the quick response and clarification. For the dataset preparation, I had probably misunderstood what the paper said. And how about the training config/code? > As described...
> Thanks for your reply. It seems that the difference between Chinese and English BERT did cause this problem. I lowered the learning rate, and ITM and LM are currently...
> You can modify this config file to pre-train with ViT-L: https://github.com/salesforce/LAVIS/blob/main/lavis/projects/blip2/train/pretrain_stage1.yaml Thanks for the insight. Also, I downloaded the Laion-multi dataset to do my own training and found that the epoch...
> You can modify this config file to pre-train with ViT-L: https://github.com/salesforce/LAVIS/blob/main/lavis/projects/blip2/train/pretrain_stage1.yaml Also, do you know why [pretrain_stage1.yaml](https://github.com/salesforce/LAVIS/blob/main/lavis/projects/blip2/train/pretrain_stage1.yaml) has no validation dataset? Only a training dataset is used... In this case,...
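For reference, a minimal sketch of the kind of override being discussed, done in Python via OmegaConf (which LAVIS uses for its configs). The key names (`model.vit_model`, `run.valid_splits`) and the `clip_L` value reflect my reading of the LAVIS configs (e.g. the `blip2_pretrain_vitL.yaml` model config) and should be verified against the repository; this is not an official recipe.

```python
# Sketch (assumptions noted above): override the frozen image encoder in the
# stage-1 pre-training config and save a new project config to launch from.
from omegaconf import OmegaConf

cfg = OmegaConf.load("lavis/projects/blip2/train/pretrain_stage1.yaml")

# Switch the vision backbone from EVA-CLIP ViT-g to CLIP ViT-L
# (mirrors the repo's blip2_pretrain_vitL.yaml model config).
cfg.model.vit_model = "clip_L"

# The stock stage-1 config defines only training splits; a validation split
# could be added here if the dataset builder provides one.
# cfg.run.valid_splits = ["val"]

OmegaConf.save(cfg, "pretrain_stage1_vitL.yaml")
```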
> @rayvzn119 Stanford's Vicuna is mainly full fine-tuning + no 8-bit + full sequence length (2048). Their earlier release was only average, but the recent V1.1 version performs quite well; it is based on the 13B model. We mainly train a 7B model with LoRA + 8-bit. Due to limited resources, our current goal is still to improve Chinese capability under a small resource budget. At present, the Chinese capability is indeed not as good as their 13B V1.1 version. > @fireice009 You can refer to this [issue](https://github.com/Facico/Chinese-Vicuna/issues/48). Looking forward to a better version based on 13B~
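For readers unfamiliar with the 7B + LoRA + 8-bit setup mentioned above, here is a minimal sketch using the Hugging Face transformers/peft APIs. This is not the project's actual training script; the base model name and LoRA hyperparameters are illustrative placeholders.

```python
# Sketch: load a 7B base model in 8-bit and attach LoRA adapters for training.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

base_model = "decapoda-research/llama-7b-hf"  # placeholder 7B base model

model = LlamaForCausalLM.from_pretrained(
    base_model,
    load_in_8bit=True,          # 8-bit weights to fit small GPUs
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = LlamaTokenizer.from_pretrained(base_model)

# Cast/prepare layers so 8-bit weights can be fine-tuned stably.
model = prepare_model_for_int8_training(model)

lora_config = LoraConfig(
    r=8,                        # illustrative rank
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```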
> Can you please share more detail here? @chenyzh28 By Chinese BERT, do you mean that you changed the vocab, or something else? What new learning rate did you...
> Thanks for your awesome work on InstructBLIP. When I try to reproduce the result in Figure 5 of your paper, the result is not ideal. > > ``` >...
> Yes, this project and MiniGPT-4 both follow the BLIP-2 approach, with similar structures. For details, see the code in the model folder. Judging from the layer configuration, it is EVA-CLIP... a huge model.
> I am currently using this tool on a website that uses AWS CloudFront to host all their images. However, if you make too many requests to their URLs, you...
Also, the performance looks extremely slow (considerably slower than local disk)... Is this expected?