Li Hui comments

Results: 42 comments of Li Hui

[Bug] MLA slower than default for small context long outputs and generating bad output reproducibly

I have the same problem when I enable FlashInfer MLA.

[Bug] MLA slower than default for small context long outputs and generating bad output reproducibly

> [@pseudotensor](https://github.com/pseudotensor) [@Hugh-yw](https://github.com/Hugh-yw) [@lambert0312](https://github.com/lambert0312) The issue of bad output should be fixed by [#3785](https://github.com/sgl-project/sglang/pull/3785), please stay tuned! Great work @Fridge003

[Track] DeepSeek V3/R1 nextn progress

I see `compatible with radix cache and chunked prefill`. How is it going? Long context scenarios require this feature. @zhyncs

[Feature] DeepSeek V3 optimization

The overlap scheduler with DP attention cannot be used on A800 * 4, because it always runs out of memory (OOM).

[Feature] DeepSeek V3 optimization

[DeepSeek MTP spec decode #12755](https://github.com/vllm-project/vllm/pull/12755) implements DeepSeek MTP (https://github.com/vllm-project/vllm/issues/12181) to support DeepSeek MTP layers for next-n prediction.

[Feature] DeepSeek V3 optimization

This is [CentML](https://github.com/CentML)'s implementation of DeepSeek MTP modules that enable speculative decoding for DeepSeek-R1: https://github.com/vllm-project/vllm/pull/12915

[MOE] enable efficient moe_alignment multi-blocks execution (3x~6x)

Great work!

[MOE] enable efficient moe_alignment multi-blocks execution (3x~6x)

> Thank you ! I am working on ROCM (MI210) platform. Will update soon.

Can you verify the A800 environment? @yiakwy-xpu-ml-framework-team

[MOE] enable efficient moe_alignment multi-blocks execution (3x~6x)

> > > Thank you ! I am working on ROCM (MI210) platform. Will update soon. > > > > > > Can you verify the A800 environment? @yiakwy-xpu-ml-framework-team >...

[MOE] enable efficient moe_alignment multi-blocks execution (3x~6x)

@yiakwy-xpu-ml-framework-team I rebuilt the kernel using the new code, and the following error occurred when starting: ``` [2025-02-19 02:32:32 TP29] Scheduler hit an exception: Traceback (most recent call last): File...