MiniCPM-V icon indicating copy to clipboard operation
MiniCPM-V copied to clipboard

💡 [REQUEST] - V 2.6 finetune multiple images example

Open Mihaiii opened this issue 1 year ago • 2 comments

起始日期 | Start Date

No response

实现PR | Implementation PR

No response

相关Issues | Reference Issues

No response

摘要 | Summary

Since V2.6 now accepts multiple images per sample, could you please update the finetune readme with an example for how the data json should look like when doing LoRA finetuning with multiple images per sample in one single user turn?

基本示例 | Basic Example

缺陷 | Drawbacks

I'm not sure how to format the data for multi-image LoRA finetuning. An example would help.

未解决问题 | Unresolved questions

No response

Mihaiii avatar Aug 06 '24 23:08 Mihaiii

Closing as duplicate of #386 .

Mihaiii avatar Aug 06 '24 23:08 Mihaiii

I'm reopening this since #386 is now closed.

We have info on how to lora finetune on V2.5, but in V2.5 can't add multiple images per samples. In V2.6 we can (which is awesome!), but how to LoRA finetune with multiple images per sample?

Mihaiii avatar Aug 09 '24 13:08 Mihaiii

In V2.6 we can (which is awesome!) Hi @Mihaiii , can you advise how to fine-tune with multiple images in V2.6, I'm struggling to see how to do it?

In MiniCPM-V/finetune/dataset.py, I see it opens a single image per conversation

image = Image.open(self.raw_data[i]["image"]).convert("RGB")

chris-tng avatar Aug 11 '24 17:08 chris-tng

@chris-tng I know it's possible, but I don't know how to do it. This is why I opened this issue: I'm also looking for a script with an example and how to prepare the data.

Mihaiii avatar Aug 11 '24 21:08 Mihaiii

@Mihaiii In this thread https://github.com/OpenBMB/MiniCPM-V/issues/386, @qyc-98 has commented this will be supported in near future, so I assume it's not supported now

chris-tng avatar Aug 12 '24 17:08 chris-tng

@chris-tng I see, thanks! I thought the message meant they will provide the scripts for finetuning on multiple images in the near future, not that it's not supported atm. But it's strange it's not supported, given the training was done with multiple images per sample.

Mihaiii avatar Aug 12 '24 17:08 Mihaiii

It's a bit weird that the scripts in the repo are set to v2.6.

2U1 avatar Aug 13 '24 04:08 2U1

looks like latest commit enables support for multi-image fine-tuning, I will give it a try

https://github.com/OpenBMB/MiniCPM-V/commit/cd64150b5122f8ee8c677d481c97918485129b52

chris-tng avatar Aug 15 '24 05:08 chris-tng