Liang Shining

10 comments by Liang Shining

I agree with the approach suggested above: a WeChat Official Account can be used to promote the project on major community sites and drive traffic.

@YasinZhao @loveJasmine Thanks to both of you for the feedback; I will look into it soon.

> But I have a question: I couldn't find the implementation of the self.attention part the repo owner mentioned. Did the repo owner treat attention flow as self-attention?

Yes, exactly. A form of self-attention is added inside the attention flow; it is different from the current Transformer-based self-attention.
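For context, here is a minimal sketch of the kind of self-attention layer that can be stacked on top of the attention-flow output. This is an assumption-laden illustration in PyTorch with hypothetical names (`ContextSelfAttention`, input `g` as the attention-flow output), not the repo's actual code:

```python
import torch
import torch.nn as nn

class ContextSelfAttention(nn.Module):
    """Hypothetical self-attention applied after the attention-flow (C2Q/Q2C) step."""

    def __init__(self, hidden_size: int):
        super().__init__()
        # Single linear projection to compute similarity scores between context positions
        self.score = nn.Linear(hidden_size, hidden_size, bias=False)

    def forward(self, g: torch.Tensor, mask: torch.Tensor = None) -> torch.Tensor:
        # g: (batch, ctx_len, hidden) -- the attention-flow output
        scores = torch.bmm(self.score(g), g.transpose(1, 2))  # (batch, ctx_len, ctx_len)
        if mask is not None:
            # mask: (batch, ctx_len) boolean; block attention to padding positions
            scores = scores.masked_fill(~mask.unsqueeze(1), float("-inf"))
        attn = torch.softmax(scores, dim=-1)
        # Each context position is re-encoded as a weighted sum over all positions
        return torch.bmm(attn, g)  # (batch, ctx_len, hidden)
```

Unlike Transformer-style self-attention, there are no separate query/key/value projections or multiple heads here, which matches the spirit of the lighter-weight variant described above.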

> Hi, I just published ONNX version with scripts to do the ONNX conversion here: https://huggingface.co/aapot/bge-m3-onnx

Thanks for your work. It seems to be a CPU-only version, right?
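One way to answer that question locally is to check which execution providers onnxruntime actually picks up. A minimal sketch, assuming the exported file is named `model.onnx` (the file name and path are assumptions, not taken from the linked repo):

```python
import onnxruntime as ort

# Providers available in the installed onnxruntime build
# (CPU-only builds will not list CUDAExecutionProvider)
print(ort.get_available_providers())

session = ort.InferenceSession(
    "bge-m3-onnx/model.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],  # falls back to CPU if no GPU build
)

# Shows which provider was actually selected for this session
print(session.get_providers())
```

If only `CPUExecutionProvider` appears, the limitation is usually the installed onnxruntime package rather than the exported model itself.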

Hi @hiyouga Are there any merge blockers on this PR? I'm running SFT on Qwen2.5 for a long-context task, and I think sequence parallelism would help a lot in accelerating it. If...

> @shiningliang This PR diverges from LLaMA-Factory's last release v0.9.1. For now, known errors with SP are with multi-modal data & models. Pure text models should work well.

Hi @HaoshengZou...

I hit the same issue when trying Qwen2.5 sequence parallel and fixed it by **downgrading transformers to 4.42.4**. Will the owner migrate the code to support different versions of transformers?
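Until the code is migrated, a simple guard can catch the incompatible version early. This is a sketch of an assumed helper (the function name and the "only 4.42.4 works" assumption come from the workaround above, not from the repo):

```python
from packaging import version
import transformers

# Version reported to work with the sequence-parallel patch
KNOWN_GOOD = version.parse("4.42.4")

def sequence_parallel_supported() -> bool:
    # Newer transformers releases changed internals the SP patch relies on,
    # so only the pinned version is treated as known-good here.
    return version.parse(transformers.__version__) == KNOWN_GOOD

if not sequence_parallel_supported():
    raise RuntimeError(
        f"transformers=={transformers.__version__} detected; "
        "downgrade with `pip install transformers==4.42.4` before enabling sequence parallel."
    )
```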

Sequence parallel needs transformers

> > Hello, will the training pipeline support TP, PP, etc.?
>
> Optimizations supported by Megatron will come after the ms-swift 3.0 refactor, roughly 1-2 months from now.

Is there any hope of support this year? 😂