TonyUSTC

5 issues by TonyUSTC

### Question When using LoRA weights for inference, should model_base be llava-v1.5-13b or vicuna-13b-v1.5? What is the difference between them?

At line 265, after multi-GPU data synchronization, the computation of `cross_targets` is wrong: it needs to account for the current local rank. https://github.com/FlagOpen/FlagEmbedding/blob/97f57a1b92dc68d56731a1e38a2d3aad4cd67e20/FlagEmbedding/BGE_M3/modeling.py#L265 Original: `cross_targets = idxs_cross * (cross_p_dense_vecs.size(0) // cross_q_dense_vecs.size(0))` Should be: `cross_targets = idxs_cross * (cross_p_dense_vecs.size(0) // cross_q_dense_vecs.size(0)) + self.process_rank * p_dense_vecs.size(0)`
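A minimal sketch of the rank-offset idea behind the proposed fix, using plain Python lists instead of tensors (the function name and signature are hypothetical, not from the FlagEmbedding code): after an all-gather, the passage tensor is the concatenation of every rank's local block, so each rank's contrastive targets must be shifted by its own block offset.

```python
def cross_targets(local_q: int, group_size: int, rank: int) -> list[int]:
    """Position of each local query's positive passage inside the
    all-gathered passage tensor (illustrative sketch).

    local_q    -- queries held by this rank
    group_size -- passages per query (1 positive + negatives), i.e.
                  cross_p_dense_vecs.size(0) // cross_q_dense_vecs.size(0)
    rank       -- the process rank (self.process_rank)
    """
    local_p = local_q * group_size  # passages this rank contributed
    # Gathered passages are laid out [rank 0's block, rank 1's block, ...],
    # so the original `i * group_size` is only correct on rank 0; every
    # other rank must add its block offset `rank * local_p`.
    return [i * group_size + rank * local_p for i in range(local_q)]
```

For example, with 2 queries per rank and 4 passages per query, rank 0 gets targets `[0, 4]` while rank 1 gets `[8, 12]`; without the offset, both ranks would point at rank 0's passages.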

Ran into a strange situation: QLoRA fine-tuning of llama3-8b loads and runs fine on a single GPU, but with multiple GPUs it OOMs while loading the weights (use_unsloth is set to false). Watching memory usage, only GPU 0's memory keeps growing until the OOM.
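The symptom (only GPU 0 filling up) is consistent with every data-parallel process loading the quantized model onto the default device. A hedged workaround sketch, assuming a Transformers + bitsandbytes setup: pin each process's model to its own GPU via `device_map`. The helper name and the model id below are assumptions, not from the original report.

```python
import os

def per_rank_device_map() -> dict:
    """Map the whole model ("" = all modules) to this process's GPU,
    instead of letting every rank default to cuda:0 (hypothetical helper)."""
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    return {"": local_rank}

if __name__ == "__main__":
    import torch
    if torch.cuda.is_available():
        from transformers import AutoModelForCausalLM, BitsAndBytesConfig
        bnb = BitsAndBytesConfig(load_in_4bit=True,
                                 bnb_4bit_compute_dtype=torch.bfloat16)
        model = AutoModelForCausalLM.from_pretrained(
            "meta-llama/Meta-Llama-3-8B",   # assumed model id
            quantization_config=bnb,
            device_map=per_rank_device_map(),
        )
```

Under `torchrun`, `LOCAL_RANK` is set per process, so rank 1 loads onto cuda:1, rank 2 onto cuda:2, and so on, instead of all ranks piling onto GPU 0.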

### Your current environment python: 3.8, cuda: 11.8, vllm: 0.5.5+cu118 ### 🐛 Describe the bug My LLM model is qwen2 1.5b, so I want to...


![image](https://github.com/user-attachments/assets/f40d74bd-2507-4dd8-9de8-b20159a29a34) Every time it reaches step 583, it throws an error.