InternEvo
InternEvo is an open-sourced lightweight training framework that aims to support model pre-training without the need for extensive dependencies.
### Describe the question. A typo was found when loading and saving scheduler states: https://github.com/InternLM/InternEvo/blob/be3291090bf080984d98bb811143288057d492ee/internlm/utils/model_checkpoint.py#L857 It may also be a better option to aggregate all constant variables into a single module and...
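As a rough illustration of the second point, assuming the repeated literals are checkpoint state keys, collecting them in one shared module turns a typo into an import-time error instead of a silent mismatch at load time. The module and constant names below are hypothetical, not InternEvo's actual layout.

```python
# Hypothetical constants module, e.g. internlm/utils/ckpt_keys.py.
# Every state-dict key used by save/load is typed exactly once here.
SCHEDULER_STATE_KEY = "scheduler"
OPTIMIZER_STATE_KEY = "optimizer"
MODEL_STATE_KEY = "model"

# Call sites then reference the shared names instead of re-typing the literals:
#   states[SCHEDULER_STATE_KEY] = scheduler.state_dict()
#   scheduler.load_state_dict(states[SCHEDULER_STATE_KEY])
```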
### Describe the feature The Windows build of flash-attention that can currently be compiled depends on cu121 + py310 + torch2.1, while InternEvo only depends on cu118, so the two libraries conflict and training on Windows is not possible. Are there plans to upgrade to cu121 in the future? Thanks! ### Will you implement it? - [ ] I would like to implement this feature and create a PR!
### Describe the question Starting from a checkpoint in InternLM format, if I want to use tensor parallelism or pipeline parallelism, do I need to split the model myself to match the target parallel degree? Loading it directly raises an error. Does the project provide a corresponding script?
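For context, the parallel degrees that a checkpoint's sharding has to match are declared in the config's `parallel` section. The sketch below follows the field names of one of the published example configs; the exact schema may differ between versions and the values here are illustrative only.

```python
# Illustrative parallel section of an InternEvo config; a checkpoint saved under
# one set of degrees generally cannot be loaded directly under a different set.
parallel = dict(
    zero1=8,                    # ZeRO-1 optimizer-state partition group size
    tensor=2,                   # tensor-parallel degree
    pipeline=dict(size=2, interleaved_overlap=True),  # pipeline-parallel degree
    sequence_parallel=False,
)
```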
### Describe the bug I set up the environment following the installation guide, but pretraining fails with this error. ### Environment The environment was set up exactly as described in https://github.com/InternLM/InternEvo/blob/develop/doc/install.md ### Other information _No response_
### Describe the feature Should we remove the other dependencies of flash-attention and keep only the core attention-related ops? If possible, we could install flash-attention with pip alone, avoiding...
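A minimal sketch of the kind of optional-import guard this decoupling implies, so the core attention path still works when flash-attention is absent. The dispatch function is illustrative, not InternEvo's actual API; `flash_attn_func` and `torch.nn.functional.scaled_dot_product_attention` are real calls in flash-attn 2.x and PyTorch 2.x respectively.

```python
import torch
import torch.nn.functional as F

try:
    from flash_attn import flash_attn_func  # available when flash-attn is pip-installed
    HAS_FLASH_ATTN = True
except ImportError:
    HAS_FLASH_ATTN = False

def attention(q, k, v, causal=True):
    """Use flash-attention when present, otherwise fall back to a pure-torch kernel.

    q, k, v: (batch, seqlen, nheads, headdim) tensors, the layout flash-attn expects.
    """
    if HAS_FLASH_ATTN:
        return flash_attn_func(q, k, v, causal=causal)
    # scaled_dot_product_attention expects (batch, nheads, seqlen, headdim)
    out = F.scaled_dot_product_attention(
        q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2), is_causal=causal
    )
    return out.transpose(1, 2)
```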
### Describe the bug Thank you very much for your work! I ran into a problem when using the code for SFT. It runs fine with a non-MoE config, but fails with an error after switching to the MoE config file. Command: ``` torchrun --nnodes=1 --nproc_per_node=8 train.py --config ./configs/7B_MoE4_sft.py --launcher "torch" ``` Error message: ``` Traceback (most recent call last): File "train.py", line 324, in main(args) File "train.py",...
### Describe the feature Hi there, could you give some suggestions for training small model sizes, such as 1B or 3B, and related configurations? Thanks a ton! ### Will you implement it? - [ ]...
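As a starting point, a scaled-down `model` block is sketched below. The field names follow the published 7B example config, but the sizes are illustrative guesses for a roughly 1B-parameter model, not an official recipe.

```python
# Illustrative ~1B model section; shrink num_layers / hidden_size further for
# smaller budgets, keeping head_dim = hidden_size / num_attention_heads fixed.
model = dict(
    num_layers=24,           # vs. 32 in the 7B config
    hidden_size=2048,        # vs. 4096 in the 7B config
    num_attention_heads=16,  # keeps head_dim at 128
    mlp_ratio=8 / 3,         # same FFN expansion as the 7B config
    vocab_size=103168,
    dtype="torch.bfloat16",
    norm_type="rmsnorm",
    layer_norm_epsilon=1e-5,
)
```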
### Describe the feature https://github.com/InternLM/InternEvo/blob/d3fabf84f1e6974b0b82ff2cf8685067792824ec/web_demo.py#L30 ### Will you implement it? - [ ] I would like to implement this feature and create a PR!
### Describe the feature CI should include a true no-flash-attention environment. ### Will you implement it? - [X] I would like to implement this feature and create a PR!
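One way to make such a CI job meaningful, sketched here as an assumption rather than the project's actual test setup, is to skip only the tests that genuinely need flash-attention and run everything else against the fallback path. The marker uses the standard-library `importlib` and `pytest`; the test name is hypothetical.

```python
import importlib.util
import pytest

# True only in environments where flash-attention is actually installed.
HAS_FLASH_ATTN = importlib.util.find_spec("flash_attn") is not None

requires_flash_attn = pytest.mark.skipif(
    not HAS_FLASH_ATTN, reason="flash-attention not installed in this CI job"
)

@requires_flash_attn
def test_flash_attention_forward():  # hypothetical test name
    from flash_attn import flash_attn_func  # imported only when present
    ...
```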