HelloWorld506 issues

Results: 2 issues of HelloWorld506

Garbled inference output after fine-tuning deepseek

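### Reminder - [x] I have read the above rules and searched the existing issues. ### System Info Latest version of llamafactory ### Reproduction I fine-tuned the deepseek-qwen-7B model. My target outputs are only A, B, or C, and accuracy during training was very high, but at inference time the model outputs a chain of thought, and sometimes even words such as "user" that appear in the input. Was something done during training that keeps the model from outputting the chain of thought? Also, why does inference output words from the input, and how can this be fixed? ### Others _No response_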

bug

pending

Multi-node training with zero3 is very slow

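### Reminder - [X] I have read the README and searched the existing issues. ### System Info Multiple nodes, each with 8 A100 40G GPUs, training a 72B model ### Reproduction Multi-node training with zero3 is very slow. I have multiple nodes, each with 8 A100s. With zero3, the 72B model is sharded across all GPUs on all nodes. Even with nvlink, communication latency is still very high, so training is slow, much slower than single-node zero3+offload. As a result, the more nodes I use, the slower training gets, and I would be better off using a single node. In practice, though, the 8 A100s of a single node should in theory be able to hold the entire 72B model with zero3. If the model could be sharded only within a node, that should in theory reduce communication and speed up training. Is there a way to make zero3 shard the model parameters only within the same node, so that every node keeps a complete copy of the model and only gradients are synchronized between nodes, to speed up training? ### Expected behavior _No response_...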

pending
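
For the node-local sharding asked about in the zero3 issue above, one DeepSpeed setting that appears related is the ZeRO++ hierarchical partitioning option `zero_hpz_partition_size`. It keeps a secondary copy of the weights partitioned within each node, so that the weight all-gather in the backward pass can stay on intra-node links. The sketch below only builds such a config in Python. The helper name `build_zero3_hpz_config` is hypothetical, the `"auto"` values assume the Hugging Face Trainer integration, and whether the extra weight copy fits on 40G GPUs for a 72B model is untested here.

```python
import json


def build_zero3_hpz_config(gpus_per_node: int) -> dict:
    """Build a DeepSpeed config dict for ZeRO-3 with hpZ (hypothetical helper)."""
    return {
        # "auto" values are resolved by the Hugging Face Trainer integration (assumption).
        "train_micro_batch_size_per_gpu": "auto",
        "gradient_accumulation_steps": "auto",
        "bf16": {"enabled": "auto"},
        "zero_optimization": {
            "stage": 3,
            # ZeRO++ hpZ: secondary weight partition sized to one node,
            # so backward-pass weight gathers stay inside the node.
            "zero_hpz_partition_size": gpus_per_node,
            "overlap_comm": True,
            "contiguous_gradients": True,
        },
    }


# 8 GPUs per node, as described in the issue.
cfg = build_zero3_hpz_config(gpus_per_node=8)
config_text = json.dumps(cfg, indent=2)
print(config_text)
```

The printed JSON could be saved to a file and passed as the DeepSpeed config of the training run. Gradients and optimizer states are still sharded across all GPUs with this setting, so it is closer to "gather weights within the node" than to a full per-node model replica.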