
Can we use in-context multimodal data for finetuning?

Open waltonfuture opened this issue 8 months ago • 6 comments

Thanks for your great work! However, it seems that we can only use data that contains one image for SFT. Can we use in-context multimodal data (i.e., containing multiple images) for finetuning?

waltonfuture, Jun 06 '24 19:06