
What is the data format of the training data?

BucherLi opened this issue 6 months ago • 6 comments

Is the data format of the training data the same as the ShareGPT format?

BucherLi · Jun 11 '25

Yes, the same format.

charmway · Jun 13 '25

Note that the files are JSONL, not JSON: one JSON object per line.

Siegfried-qgf · Jun 13 '25

It is standard multi-turn SFT training data, with the responses generated by the target model itself.

xinlong-yang · Jun 15 '25
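Putting the answers above together, here is a minimal sketch of one training record. The field names (`id`, `conversations`, `from`, `value`) follow the common ShareGPT convention; the exact keys expected by the EAGLE training script may differ, so check its data loader.

```python
import json

# One ShareGPT-style multi-turn dialogue. In a .jsonl file, each line
# holds exactly one such record (JSON Lines, not a single JSON array).
record = {
    "id": "sample-0",
    "conversations": [
        {"from": "human", "value": "What is speculative decoding?"},
        # Assistant turns are regenerated by the target model itself,
        # so the draft model learns the target's output distribution.
        {"from": "gpt", "value": "Speculative decoding drafts several tokens ahead..."},
    ],
}

# Append the record as a single line of the JSONL training file.
with open("train.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record, ensure_ascii=False) + "\n")
```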

We have successfully trained the Eagle3 versions of Qwen3-8B and Qwen3-30B-A3B based on the official training code, and have open-sourced them. On a single H200 GPU using the sglang inference framework, Qwen3-8B with Eagle3 achieves a performance boost from 186 tokens/second to 365 tokens/second, while Qwen3-30B-A3B with Eagle3 improves from 147 tokens/second to 231 tokens/second.

We used the ultra_200k dataset and re-ran inference with Qwen3 to regenerate the data, which was then used as the final training set. In total, 600K dialogues were used for training.
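As a rough illustration of that regeneration step, the sketch below re-answers each prompt with the target model via Hugging Face `transformers`. The model name, file paths, and generation settings are placeholders, not the authors' exact pipeline, and for brevity it does not carry earlier turns in the generation context, which a real multi-turn pipeline would.

```python
import json
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder target model; the authors used Qwen3-8B and Qwen3-30B-A3B.
model_name = "Qwen/Qwen3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

with open("ultra_200k.jsonl") as fin, open("regenerated.jsonl", "w") as fout:
    for line in fin:
        rec = json.loads(line)
        new_conv = []
        for turn in rec["conversations"]:
            if turn["from"] != "human":
                continue  # drop old assistant turns; we regenerate them
            new_conv.append(turn)
            # Regenerate the assistant reply with the target model.
            prompt = tokenizer.apply_chat_template(
                [{"role": "user", "content": turn["value"]}],
                tokenize=False,
                add_generation_prompt=True,
            )
            inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
            out = model.generate(**inputs, max_new_tokens=1024)
            reply = tokenizer.decode(
                out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
            )
            new_conv.append({"from": "gpt", "value": reply})
        rec["conversations"] = new_conv
        fout.write(json.dumps(rec, ensure_ascii=False) + "\n")
```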

https://huggingface.co/Tengyunw/qwen3_30b_moe_eagle3

https://huggingface.co/Tengyunw/qwen3_8b_eagle3

We have also published a report detailing how to reproduce the Eagle3 training process; the links are provided below for reference.

https://mp.weixin.qq.com/s/Dmdg6aLgFHZEcm6TY1vKkA

https://zhuanlan.zhihu.com/p/1923763301432662012

jiahe7ay · Jul 02 '25

@BucherLi @charmway

jiahe7ay · Jul 02 '25

We have now open-sourced our training data regenerated using Qwen3-8B. https://huggingface.co/datasets/Tengyunw/qwen3_8b_eagle3

jiahe7ay · Jul 04 '25
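For anyone who wants to inspect the released data, it can be pulled with the Hugging Face `datasets` library. This is a minimal sketch; the split and field names are assumptions and should be checked against the dataset card.

```python
from datasets import load_dataset

# Pull the regenerated Qwen3-8B training data from the Hub.
# The "train" split name is an assumption; see the dataset card.
ds = load_dataset("Tengyunw/qwen3_8b_eagle3", split="train")
print(ds[0])  # inspect one record to confirm the field layout
```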