InternEvo issues

fix(mlp): enhance mlp_layer_fusion

## Motivation 1. The `mlp_layer_fusion` config is useful in MoE; therefore, a warning is added to recommend that users set this config to True in the MoE model. 2. When...

yingtongxiong

[Feat] Heterogeneous Code Part 1: Add Model and Module Code for Chameleon Lumina

Thanks for your contribution and we appreciate it a lot. The following instructions would make your pull request more healthy and more easily get feedback. If you do not understand...

zhhsplendid

[QA] "Nan grad norm occurs, please check it" partway through training

### Describe the question. After a little over 600 training steps, the run reports "Nan grad norm occurs, please check it". How should I adjust my setup? The error message and training config are below. Error message: ``` 2024-11-24 02:35:28,531 INFO pipeline.py:770 in record_current_batch_training_metrics -- tflops=229.64161376788178 step=642 loss=2.1725409030914307 real_tgs=719.84 tgs (tokens/gpu/second)=720.46 tgs/last_tgs_1=720.46 tgs/tgs_all=716.63 tgs/tgs_avg=722.47 tgs/tgs_SMA=722.84 tgs/last_tgs_10=723.59 tgs/last_tgs_50=722.37...

wang-benqiang

question

[QA] Does internEvo support loongtrain selective checkpoint++?

1

### Describe the question. Thank you for the great work. The question is the same as the title.

wplf

question

feat(moe): support moe zero1 setting

Thanks for your contribution and we appreciate it a lot. The following instructions would make your pull request more healthy and more easily get feedback. If you do not understand...

blankde


fix(monitor): send exception when feishu alert is enable && remove light monitoring address

fix llava model device bugs

feat(moe): add gshard token rearrange optim

[QA] How do I fine-tune on a single GPU, and which settings need to be adjusted?
