2 comments by bonre
Hello, I also encountered a similar problem here. The model I trained is InternVL2-8B, and the GPUs are 8*A100 40G. I have tried various methods for DPO training. Here are...
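Thank you very much for your team's hard work! Regarding the third point, are there any plans to add a monitoring feature similar to channel loss, that is, tracking the loss trend separately for the datasets of different downstream tasks? I see that version 2.5 already supports PT for MLLMs, and I think this feature is fairly important for MLLM post pre-training. I hope you will consider it :>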
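To make the channel loss request concrete, here is a minimal sketch of what I have in mind, assuming each training sample carries a tag naming the dataset (channel) it came from and that per-sample losses are available before reduction. The function name `channel_losses` and the tags `ocr` and `vqa` are hypothetical and used only for illustration; this is not an existing InternVL API.

```python
from collections import defaultdict


def channel_losses(per_sample_losses, channels):
    """Average per-sample losses within each channel (dataset/task tag).

    Hypothetical helper for illustration: `per_sample_losses` is a sequence
    of floats and `channels` is a same-length sequence of tags.
    """
    if len(per_sample_losses) != len(channels):
        raise ValueError("per_sample_losses and channels must have equal length")
    totals = defaultdict(float)
    counts = defaultdict(int)
    for loss, channel in zip(per_sample_losses, channels):
        totals[channel] += float(loss)
        counts[channel] += 1
    # One mean per channel, so each dataset gets its own loss curve.
    return {name: totals[name] / counts[name] for name in totals}


# Hypothetical batch: two samples from an "ocr" dataset and one from "vqa".
batch_losses = [2.0, 4.0, 1.5]
batch_channels = ["ocr", "ocr", "vqa"]
print(channel_losses(batch_losses, batch_channels))  # {'ocr': 3.0, 'vqa': 1.5}
```

Logging the returned dictionary at every step (for example, one scalar per key) would give a separate curve per downstream dataset alongside the overall loss.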