Haosheng Zou (邹昊晟)

Results: 5 issues by Haosheng Zou (邹昊晟)

Does seq_length in config.json fully reflect the context-window length used in pretraining? Was the 72B model trained from scratch with a 32k window?

According to config.json, inference is vanilla within 32k and dynamic NTK beyond 32k? Is the 200k needle-in-a-haystack test also run with this inference scheme? I implemented a preliminary version of "vanilla within 32k, dynamic NTK beyond 32k" inference in lightllm and tested internlm2-chat-7b on a few cases of the original needle-in-a-haystack benchmark at 200k length, but it produced garbled English output. Am I using the wrong model, or is something off in my inference settings? Does scaling_factor need to be set explicitly? How can I reproduce the 7B-200k needle-in-a-haystack results from the official WeChat-account post?
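For reference, a minimal sketch of the "vanilla within 32k, dynamic NTK beyond 32k" rule as it appears in HF transformers' dynamic NTK RoPE scaling; the `dim`, `base`, and `max_pos` defaults below are assumptions for an InternLM2-7B-style config, not confirmed values:

```python
import torch

def dynamic_ntk_inv_freq(seq_len: int,
                         dim: int = 128,           # assumed per-head dimension
                         base: float = 1_000_000.0,  # assumed rope_theta
                         max_pos: int = 32768,     # trained window (seq_length)
                         scaling_factor: float = 1.0) -> torch.Tensor:
    # Within the trained window: plain RoPE frequencies.
    # Beyond it: rescale the RoPE base so positions stay in-distribution
    # (the dynamic NTK formula used by HF transformers).
    if seq_len > max_pos:
        base = base * (
            (scaling_factor * seq_len / max_pos) - (scaling_factor - 1)
        ) ** (dim / (dim - 2))
    return 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))
```

Note that even with scaling_factor=1.0 the formula still rescales base by (seq_len/max_pos)**(dim/(dim-2)) once seq_len exceeds the trained window, which is one reason the question above asks whether scaling_factor needs an explicit value.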

question

### Required prerequisites
- [X] I have read the documentation.
- [X] I have searched the [Issue Tracker](https://github.com/PKU-Alignment/safe-rlhf/issues) and [Discussions](https://github.com/PKU-Alignment/safe-rlhf/discussions) that this hasn't already been reported. (+1 or comment...

question

Any plans on releasing the DPO code, or a brief intro of how you conducted long-context DPO?
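For readers unfamiliar with DPO, here is a minimal sketch of the standard loss. The long-context recipe asked about here is unreleased, so how the per-sequence log-probs are computed and sharded over very long inputs is exactly the open question; all names below are illustrative:

```python
import torch.nn.functional as F

def dpo_loss(pi_chosen_logp, pi_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta: float = 0.1):
    # Standard DPO: maximize the margin between the policy's implicit
    # reward on the chosen vs. rejected response, measured relative to
    # a frozen reference model.
    chosen_reward = beta * (pi_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (pi_rejected_logp - ref_rejected_logp)
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()
```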

# What does this PR do?

Add Sequence Parallelism (#4733, #5024, #5207, #5815, #5841, etc.); direct plug-and-play use at https://github.com/Qihoo360/360-LLaMA-Factory. We have a separate README and chat group at https://github.com/Qihoo360/360-LLaMA-Factory, only...
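To illustrate what sequence parallelism means at the data level, here is a toy sketch of the sequence-dimension sharding only. The actual 360-LLaMA-Factory implementation also patches attention (e.g. DeepSpeed-Ulysses-style all-to-all or ring attention) so each head still attends over the full sequence, which this sketch omits:

```python
import torch
import torch.distributed as dist

def shard_sequence(hidden: torch.Tensor) -> torch.Tensor:
    """Keep only this rank's contiguous slice of the sequence dimension,
    so the activations of a very long sequence are split across GPUs.
    Assumes an initialized process group, hidden shaped [batch, seq, dim],
    and seq divisible by the world size (simplifications for the sketch)."""
    world, rank = dist.get_world_size(), dist.get_rank()
    chunk = hidden.shape[1] // world
    return hidden[:, rank * chunk:(rank + 1) * chunk, :]
```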

pending