Yineng Zhang

Results: 34 issues by Yineng Zhang

### Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [X] 2. The bug has not been fixed in the latest version.
- ...

bug

### Motivation
Hi all. @lvhan028 @lzhangzz @AllentDan When I want to use `lite autoawq` with Llama 3.1 405B Instruct, it takes a very long time.
```bash
python3 -m lmdeploy lite...
```

ref https://github.com/sgl-project/sglang/issues/913 https://github.com/triton-lang/triton/pull/4492 @Jokeren cc @ispobock @merrymercy
## env
```
Python: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
CUDA available: True
GPU 0,1,2,3,4,5,6,7: NVIDIA H100 80GB HBM3
GPU 0,1,2,3,4,5,6,7...
```

enhancement

latest main, A100
```bash
for ele in $(ls); do python3 -m pytest ${ele}; done
```
```
=================================================================================== test session starts ===================================================================================
platform linux -- Python 3.12.4, pytest-8.3.2, pluggy-1.5.0
rootdir: /flashinfer/python...
```

### Checklist
- [ ] 1. If the issue you raised is not a feature but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed.
- ...

performance

### Checklist
- [X] 1. If the issue you raised is not a feature but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed.
- [X]...

### Checklist
- [X] 1. If the issue you raised is not a feature but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed.
- [X]...

Hi bRPC developers @chenBright @wwbmmm @wasphin @thorneliu As titled, is bRPC compatible with the Stream feature of the gRPC Python client? The main use case is similar to the SSE feature...

### Checklist
- [ ] 1. I have searched related issues but cannot get the expected help.
- [ ] 2. The bug has not been fixed in the latest...

high priority

## Triton Backend
@ispobock @pankajroark
- [x] [refactor triton backend 1](https://github.com/sgl-project/sglang/pull/3292), [2](https://github.com/sgl-project/sglang/pull/3309)
- [x] [support custom mask](https://github.com/sgl-project/sglang/pull/3317)
- [x] [support EAGLE 2](https://github.com/sgl-project/sglang/pull/3466)
- [x] [compatible with CUDA Graph](https://github.com/sgl-project/sglang/pull/3500)
- [x]...

enhancement
high priority
flashinfer
deepseek