InternEvo
InternEvo is an open-source, lightweight training framework that aims to support model pre-training without the need for extensive dependencies.
## Motivation 1. The `mlp_layer_fusion` config is useful in MoE; therefore, a warning is added to recommend that users set this config to True in the MoE model. 2. When...
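The kind of check described in item 1 can be sketched as below. This is a minimal illustration, not InternEvo's actual implementation: the function name `check_mlp_layer_fusion`, the plain-dict config, and the `num_experts` key used to detect an MoE model are all assumptions; only the `mlp_layer_fusion` option name comes from the text above.

```python
import warnings


def check_mlp_layer_fusion(model_cfg: dict) -> bool:
    """Warn when an MoE model config leaves `mlp_layer_fusion` disabled.

    `num_experts` is a hypothetical key assumed here to mark an MoE model.
    Returns True if a warning was issued, False otherwise.
    """
    # Assumption: more than one expert means the model is MoE.
    is_moe = model_cfg.get("num_experts", 1) > 1
    if is_moe and not model_cfg.get("mlp_layer_fusion", False):
        warnings.warn(
            "mlp_layer_fusion is recommended to be True for MoE models.",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False
```

A warning rather than an error keeps existing configs working while still steering MoE users toward the recommended setting.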
Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help it get feedback more easily. If you do not understand...
### Describe the problem. After a little over six hundred training steps, the run reports `Nan grad norm occurs, please check it`. How should I adjust my setup? The error message and training config are below. Error message: ``` 2024-11-24 02:35:28,531 INFO pipeline.py:770 in record_current_batch_training_metrics -- tflops=229.64161376788178 step=642 loss=2.1725409030914307 real_tgs=719.84 tgs (tokens/gpu/second)=720.46 tgs/last_tgs_1=720.46 tgs/tgs_all=716.63 tgs/tgs_avg=722.47 tgs/tgs_SMA=722.84 tgs/last_tgs_10=723.59 tgs/last_tgs_50=722.37...
### Describe the question. Thank you for the great work. The question is the same as the title.