LLaVA [Question] How to evaluate pretraining[image-text alignment] performance?

Open enkaranfiles opened this issue 9 months ago • 0 comments

Question

I trained the vision tower module after replacing the vision encoder with a different one and gathering new custom data from another domain. How can I evaluate pretraining performance? Since pretraining is the crucial stage for image-text alignment, it seems this must be considered. Thanks to anyone who can respond!

Apr 29 '24 07:04 enkaranfiles
