InternEvo
InternEvo is an open-sourced lightweight training framework that aims to support model pre-training without the need for extensive dependencies.
### Describe the question. A typo was found when loading and saving scheduler states: https://github.com/InternLM/InternEvo/blob/be3291090bf080984d98bb811143288057d492ee/internlm/utils/model_checkpoint.py#L857 It may also be a better option to aggregate all constant variables into a single module and...
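As a rough illustration of the second point, assuming the repeated literals are checkpoint state keys, collecting them in one shared module turns a typo into an import-time error instead of a silent mismatch at load time. The module and constant names below are hypothetical, not InternEvo's actual layout.

```python
# Hypothetical constants module, e.g. internlm/utils/ckpt_keys.py.
# Every state-dict key used by save/load is typed exactly once here.
SCHEDULER_STATE_KEY = "scheduler"
OPTIMIZER_STATE_KEY = "optimizer"
MODEL_STATE_KEY = "model"

# Call sites then reference the shared names instead of re-typing the literals:
#   states[SCHEDULER_STATE_KEY] = scheduler.state_dict()
#   scheduler.load_state_dict(states[SCHEDULER_STATE_KEY])
```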
### Describe the feature The Windows build of flash-attention that can currently be compiled depends on cu121 + py310 + torch2.1, while InternEvo only depends on cu118, so the two libraries conflict and training on Windows is not possible. Are there plans to upgrade to cu121 in the future? Thanks! ### Will you implement it? - [ ] I would like to implement this feature and create a PR!
### Describe the question Starting from a checkpoint in InternLM format, if I want to use tensor parallelism or pipeline parallelism, do I need to split the model myself to match the target parallel degree? Loading it directly raises an error. Does the project provide a corresponding script?
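For context, the parallel degrees that a checkpoint's sharding has to match are declared in the config's `parallel` section. The sketch below follows the field names of one of the published example configs; the exact schema may differ between versions and the values here are illustrative only.

```python
# Illustrative parallel section of an InternEvo config; a checkpoint saved under
# one set of degrees generally cannot be loaded directly under a different set.
parallel = dict(
    zero1=8,                    # ZeRO-1 optimizer-state partition group size
    tensor=2,                   # tensor-parallel degree
    pipeline=dict(size=2, interleaved_overlap=True),  # pipeline-parallel degree
    sequence_parallel=False,
)
```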
### Describe the bug I set up the environment following the installation guide, but pretraining fails with this error. ### Environment The environment was set up exactly as described in https://github.com/InternLM/InternEvo/blob/develop/doc/install.md ### Other information _No response_
### Describe the feature Should we remove the other dependencies of flash-attention and keep only the core attention-related ops? If possible, we could install flash-attention with pip alone, avoiding...
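A minimal sketch of the kind of optional-import guard this decoupling implies, so the core attention path still works when flash-attention is absent. The dispatch function is illustrative, not InternEvo's actual API; `flash_attn_func` and `torch.nn.functional.scaled_dot_product_attention` are real calls in flash-attn 2.x and PyTorch 2.x respectively.

```python
import torch
import torch.nn.functional as F

try:
    from flash_attn import flash_attn_func  # available when flash-attn is pip-installed
    HAS_FLASH_ATTN = True
except ImportError:
    HAS_FLASH_ATTN = False

def attention(q, k, v, causal=True):
    """Use flash-attention when present, otherwise fall back to a pure-torch kernel.

    q, k, v: (batch, seqlen, nheads, headdim) tensors, the layout flash-attn expects.
    """
    if HAS_FLASH_ATTN:
        return flash_attn_func(q, k, v, causal=causal)
    # scaled_dot_product_attention expects (batch, nheads, seqlen, headdim)
    out = F.scaled_dot_product_attention(
        q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2), is_causal=causal
    )
    return out.transpose(1, 2)
```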
### Describe the bug Thank you very much for your work! I ran into a problem when using the code for SFT. It runs fine with a non-MoE config, but fails with an error after switching to the MoE config file. Command: ``` torchrun --nnodes=1 --nproc_per_node=8 train.py --config ./configs/7B_MoE4_sft.py --launcher "torch" ``` Error message: ``` Traceback (most recent call last): File "train.py", line 324, in main(args) File "train.py",...
### Describe the feature Hi there, could you give some suggestions for training small model sizes, such as 1B or 3B, and related configurations? Thanks a ton! ### Will you implement it? - [ ]...
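As a starting point, a scaled-down `model` block is sketched below. The field names follow the published 7B example config, but the sizes are illustrative guesses for a roughly 1B-parameter model, not an official recipe.

```python
# Illustrative ~1B model section; shrink num_layers / hidden_size further for
# smaller budgets, keeping head_dim = hidden_size / num_attention_heads fixed.
model = dict(
    num_layers=24,           # vs. 32 in the 7B config
    hidden_size=2048,        # vs. 4096 in the 7B config
    num_attention_heads=16,  # keeps head_dim at 128
    mlp_ratio=8 / 3,         # same FFN expansion as the 7B config
    vocab_size=103168,
    dtype="torch.bfloat16",
    norm_type="rmsnorm",
    layer_norm_epsilon=1e-5,
)
```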
### Describe the feature https://github.com/InternLM/InternEvo/blob/d3fabf84f1e6974b0b82ff2cf8685067792824ec/web_demo.py#L30 ### Will you implement it? - [ ] I would like to implement this feature and create a PR!
### Describe the feature CI should include a true no-flash-attention environment. ### Will you implement it? - [X] I would like to implement this feature and create a PR!
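One way to make such a CI job meaningful, sketched here as an assumption rather than the project's actual test setup, is to skip only the tests that genuinely need flash-attention and run everything else against the fallback path. The marker uses the standard-library `importlib` and `pytest`; the test name is hypothetical.

```python
import importlib.util
import pytest

# True only in environments where flash-attention is actually installed.
HAS_FLASH_ATTN = importlib.util.find_spec("flash_attn") is not None

requires_flash_attn = pytest.mark.skipif(
    not HAS_FLASH_ATTN, reason="flash-attention not installed in this CI job"
)

@requires_flash_attn
def test_flash_attention_forward():  # hypothetical test name
    from flash_attn import flash_attn_func  # imported only when present
    ...
```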