InternEvo

InternEvo is an open-source, lightweight training framework that aims to support model pre-training without the need for extensive dependencies.

Results 79 InternEvo issues

### Describe the bug ### Environment Torch2.1 ### Other information _No response_

bug

### Describe the bug We have many cases like the following: `data = torch.empty(partition_size, dtype=tensor.dtype, device=torch.cuda.current_device(), requires_grad=False)`, where we pass `device=torch.cuda.current_device()` directly. However, it is not recommended...

bug

### Describe the feature Some CPU synchronizations block the GPU kernel, leading to bubbles between GPU kernels. It should be optimized in the future. 1. item() in rotary embedding. 2....

enhancement

### Describe the feature Update the README with the new dependency versions. ### Will you implement it? - [ ] I would like to implement this feature and create a PR!

enhancement

### Describe the feature Support Hugging Face modeling Python files. ### Will you implement it? - [ ] I would like to implement this feature and create a PR!

enhancement

### Describe the feature They should not be in a separate parameter group. ### Will you implement it? - [X] I would like to implement this feature and create a PR!

enhancement

### Describe the bug Running `make -f docker.Makefile BASE_OS=ubuntu20.04` always fails with an error that I cannot resolve. It occurs at the step `[intrenlm-dev 3/3] RUN git submodule update --init --recursive`. ### Environment ERROR: failed to solve: process "/bin/sh -c git submodule update --init --recursive &&...

bug

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help it get feedback more easily. If you do not understand...

### Describe the feature The internlm2-1.8b fine-tuning config is missing. ### Will you implement it? - [X] I would like to implement this feature and create a PR!

enhancement

### Describe the feature The current implementation of the rotary_embedding class contains a large amount of legacy code. We suggest aligning it with https://github.com/Dao-AILab/flash-attention/blob/v2.2.1/flash_attn/layers/rotary.py and adding support for Triton operators. ### Will you implement it? - [ ] I would like to implement this feature and create a PR!

enhancement