Kingsley
I think it is because we add a fake audio input into the batch features when the whole batch doesn't contain 'audio' input. https://github.com/hiyouga/LLaMA-Factory/blob/5817cda37ec2ede2d94347f168f2c75241dd21ad/src/llamafactory/data/collator.py#L130
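To make the idea concrete, here is a minimal sketch of what "adding a fake audio input when the whole batch has none" could look like. It is not the code at the linked collator line: the function name `pad_batch_with_fake_audio`, the feature keys (`audios`, `input_ids`, `attention_mask`, `labels`), and the one-second silent clip are all assumptions made for illustration. The extra tokens are masked out so they should not contribute to attention or loss.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss ignores by default


def pad_batch_with_fake_audio(features, fake_token_ids, sample_rate=16000):
    """Hypothetical helper: attach a silent clip to the last sample if no sample has audio."""
    if any(f.get("audios") for f in features):
        return features  # at least one real audio input, nothing to do

    # Copy the last sample so the caller's data is not modified in place.
    last = dict(features[-1])
    last["audios"] = [[0.0] * sample_rate]  # assumed: one second of silence
    # Append placeholder tokens for the fake audio, hidden from attention and loss.
    last["input_ids"] = last["input_ids"] + list(fake_token_ids)
    last["attention_mask"] = last["attention_mask"] + [0] * len(fake_token_ids)
    last["labels"] = last["labels"] + [IGNORE_INDEX] * len(fake_token_ids)
    return features[:-1] + [last]
```

With this shape, a text-only batch still carries one audio tensor, while the masked tokens leave the training targets unchanged.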
> [@Kuangdd01](https://github.com/Kuangdd01) Thanks again! One follow-up question:
>
> Is adding the fake audio input strictly necessary? It seems like, based on the official inference code, it's also possible to...
DeepSpeed ZeRO-3 also has to broadcast the model parameters, so the inter-GPU communication volume can be very large.
Insert `` into every sample in the question column.
For example, you should have a in your question when you made up your vqa dataset. "`+`Please parse the parent-child one to one relationships from the attaches directed structure chart...
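A small sketch of that preprocessing step, assuming the dataset is a list of dicts with a `question` key. The helper name `add_placeholder` is hypothetical, and the token is passed in as a parameter because the exact placeholder string is missing from the text above; `"<TOKEN>"` in the usage lines is only a stand-in.

```python
def add_placeholder(rows, placeholder, column="question"):
    """Return copies of rows with `placeholder` prepended to `column` where it is missing."""
    updated = []
    for row in rows:
        text = row[column]
        if placeholder not in text:
            text = placeholder + text  # token first, then the original question
        updated.append({**row, column: text})
    return updated


# "<TOKEN>" is a stand-in; substitute the placeholder your template expects.
rows = [{"question": "What is shown?"}, {"question": "<TOKEN>Describe it."}]
fixed = add_placeholder(rows, "<TOKEN>")
```

Rows that already contain the placeholder are left unchanged, so the function can be run more than once safely.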
Try this for now: `pip install git+https://github.com/Kuangdd01/transformers.git@qwen25omni`
Organize your data following the format of data/mllm_audio_demo.json.
> The training parameters are:
> llamafactory-cli train `
> --stage sft `
> --do_train True `
> --model_name_or_path C://Users//76425//Desktop//LLaMA-Factory//models//Qwen2.5-Omni-3B `
> --preprocessing_num_workers 16 `
> --finetuning_type lora `
> --template qwen2_omni `...
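For the data, please refer to mllm_audio_demo; you don't need to add special tokens or the like yourself.
Is the accelerate run using the same YAML, or is it using FSDP2? Switch to DeepSpeed ZeRO-3 and try again.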