BLIP2 fine-tuning on custom LLM+dataset
Hi!
I want to extend BLIP2's capabilities to another language. I have a pre-trained LLM (T5 family) and a dataset of image captions. Could you please help me understand the next steps to train the model? Do I need to perform full pre-training, or can I use a pre-trained ViT together with my T5 model and just fine-tune?
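For context, here is a rough sketch of the fine-tuning loop I have in mind, using LAVIS's `load_model_and_preprocess` entry point with the `blip2_t5` model. This is only my guess, not a verified recipe: `my_caption_batches()` is a placeholder for my own dataloader, the sample key names (`text_input` / `text_output`) are my reading of `blip2_t5.py`, and I am assuming my custom T5 checkpoint can be swapped in through the `t5_model` field of the model's yaml config. Please correct me if any of this is wrong.

```python
# Rough sketch of the fine-tuning loop I have in mind -- not a verified recipe.
import torch
from lavis.models import load_model_and_preprocess

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load BLIP-2 with a frozen ViT, the Q-Former, and a T5 LLM.
# I would point the model config's `t5_model` field at my own pre-trained
# T5 checkpoint instead of the default flan-t5-xl (assumption on my part).
model, vis_processors, txt_processors = load_model_and_preprocess(
    name="blip2_t5",
    model_type="pretrain_flant5xl",  # stand-in; my own T5 would go in via the config
    is_eval=False,
    device=device,
)

# Only the Q-Former / projection parameters should require grad;
# the ViT and T5 weights stay frozen, as in the BLIP-2 paper.
optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=1e-5
)

# my_caption_batches() is a placeholder for my own dataloader that yields
# PIL images and caption strings in the target language.
for images, captions in my_caption_batches():
    pixel_batch = torch.stack(
        [vis_processors["train"](img) for img in images]
    ).to(device)
    samples = {
        "image": pixel_batch,
        "text_input": ["a photo of"] * len(captions),  # prompt; key names are my guess from blip2_t5.py
        "text_output": list(captions),                 # target captions
    }
    loss = model(samples)["loss"]
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

My open question is still whether a loop like this is enough, or whether the Q-Former needs to be pre-trained again (stage 1 / stage 2) because the new T5 model uses a different language.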
Hey,
Looking to do a similar project. Did you ever find an answer to your question? Would love to get some clarification.
Hello! I am also working on fine-tuning BLIP2 on the image captioning task with my own dataset. Have you had any success?
Working on the same thing. Did anyone have any luck?
Any progress?
Can you share code?