Kevin_Xiong

Results 3 issues of Kevin_Xiong

Hi, thanks for maintaining this repo. It's been really helpful for my project. However, I found something confusing when comparing it with the author's implementation. In the author's code, each GCN receives...

## Motivation

According to DeepSeek's official [documentation](https://api-docs.deepseek.com/quick_start/rate_limit), for non-streaming requests, the API continuously returns empty lines to enhance the user experience of reasoning models.

## Modifications

This pull request modifies...

Change the singe_layer_transfer kernel to support MLA copy, and simplify the code using templates, with some unit tests. Verified locally with deepseekV2-lite.