YuzaChongyi
It looks like something is wrong with the input data; can you share one of your training examples?
Sorry for the extra line break in the code formatting; we have fixed the code, so you can try it again.
Hi, could you share a sample of your training data?
Hi, you can try the simpler [finetune](https://github.com/OpenBMB/MiniCPM-V/tree/main/finetune) code we provide.
We have updated the [TrainingArguments](https://github.com/OpenBMB/MiniCPM-V/blob/main/finetune/finetune.py#L49); please fetch the latest finetune code.
It means you should change the chat_template to the [inference chat_template](https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5/blob/main/tokenizer_config.json#L2059), because the extra `assistant\n\n ` is not needed during training. You can replace the chat_template in the `tokenizer_config.json` file manually.
This is the chat_template for inference:

> `"chat_template": "{% set loop_messages = messages %}{% for message in loop_messages %}{% set content = '' + message['role'] + '\n\n'+ message['content'] | trim...`
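Since `tokenizer_config.json` is plain JSON, the manual replacement above can also be scripted. The sketch below is a minimal demo: the file path and the (shortened) template string are stand-ins, so substitute your real model directory and the full inference template from the model repo.

```python
import json

# Stand-in path -- point this at tokenizer_config.json in your local model
# directory.
path = "tokenizer_config.json"

# Write a minimal stand-in config so this demo is self-contained; with a real
# checkout, skip this step and edit the existing file.
with open(path, "w", encoding="utf-8") as f:
    json.dump({"chat_template": "old-training-template"}, f)

# Shortened, illustrative version of the inference template -- paste the full
# string from the model repo's tokenizer_config.json here.
inference_template = (
    "{% set loop_messages = messages %}{% for message in loop_messages %}"
    "{% set content = '' + message['role'] + '\n\n' + message['content'] | trim %}"
    "{{ content }}{% endfor %}"
)

# Load the config, swap in the inference chat_template, and write it back.
with open(path, "r", encoding="utf-8") as f:
    cfg = json.load(f)
cfg["chat_template"] = inference_template
with open(path, "w", encoding="utf-8") as f:
    json.dump(cfg, f, ensure_ascii=False, indent=2)
```

After this, tokenizers loaded from that directory will apply the inference template instead of the training one.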
Hi, MiniCPM-V is a multimodal large model, so its VRAM usage differs somewhat from that of a pure language model. Visual encoding also consumes VRAM, and the model has a built-in high-resolution image encoding strategy: when the input image is high resolution, it is divided into multiple patches via a slice operation. This step significantly increases the vision encoder's input length and therefore uses more VRAM. In addition, you can reduce `model_max_length` to lower VRAM usage.
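The effect of slicing on input length can be sketched with some simple arithmetic. The numbers below are illustrative assumptions, not the exact MiniCPM-V values; the point is that each extra slice adds a full set of visual tokens, which is why high-resolution inputs cost more memory.

```python
# Hypothetical number of visual tokens the vision encoder produces per
# encoded image slice (an assumption for illustration only).
TOKENS_PER_SLICE = 96

def vision_tokens(num_slices: int) -> int:
    """Total visual tokens: one global encoding plus one per slice."""
    return (1 + num_slices) * TOKENS_PER_SLICE

low_res = vision_tokens(0)   # small image, no slicing
high_res = vision_tokens(9)  # large image sliced into a 3x3 grid

print(low_res, high_res)  # 96 960
```

Under these assumed numbers, a 3x3-sliced image feeds the vision encoder ten times as many tokens as an unsliced one, and the activation memory grows accordingly.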
That seems a bit strange; can you share your training parameters, such as the batch size and learning rate? Normally, if you observe that your training loss is steadily declining, the...
> Thanks for the update. What's the minimum size (QA pairs) of the test dataset to get some effect? Normally, for general tasks, thousands of QA pairs with 3 epochs is...