Label Knight

Results 5 issues of Label Knight

batchsize defaults to 64; if I change it to a smaller value, the datasets stay the same every time.

### Is there an existing issue for this? - [X] I have searched the existing issues ### Description of the Build Error build from source ERROR pipe branch may error...

Build ERR

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction https://github.com/hiyouga/LLaMA-Factory/issues/3510 See the figure below: the backward pass runs in fp16, and the result is cast to fp32 only after the computation. The video in Microsoft's DeepSpeed introduction below also shows that the backward pass uses fp16. https://www.microsoft.com/en-us/research/blog/zero-deepspeed-new-system-optimizations-enable-training-models-with-over-100-billion-parameters/ This was brought up here before, but I don't know how to reopen an issue, so I opened a new one. ### Expected behavior _No response_ ### System Info...

enhancement
pending

### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction no ### Expected behavior Setting flash attn through args fails. In transformers (after v4.37.2): transformers/modeling_utils.py ` @classmethod def _from_config(cls,...

pending

### System Info 1 ### Who can help? _No response_ ### Information - [ ] The official example scripts - [ ]...