Lil2J

Results: 27 comments by Lil2J

We have successfully trained the Eagle3 versions of Qwen3-8B and Qwen3-30B-A3B based on the official training code, and have open-sourced them. On a single H200 GPU using the sglang inference...

@garycaokai @luoruijie @chtaihei-ust-hk

We have successfully trained the Eagle3 versions of Qwen3-8B and Qwen3-30B-A3B based on the official training code, and have open-sourced them. On a single H200 GPU using the sglang inference...

My machine has 8 × B200 GPUs, but I only used one B200.

Thank you very much for your reply. I’m also looking into the kvcached code and would like to contribute to fixing this bug. From my perspective, this project truly has...

> Thanks for digging into the code! We totally agree that quantization is a must. We'd love to collaborate if you are interested in helping with the integration. Please feel...
