Yineng Zhang
### Checklist - [X] 1. I have searched related issues but cannot get the expected help. - [X] 2. The bug has not been fixed in the latest version. -...
### Motivation Hi all. @lvhan028 @lzhangzz @AllentDan When I run `lite autoawq` on Llama 3.1 405B Instruct, it takes a very long time. ```bash python3 -m lmdeploy lite...
ref https://github.com/sgl-project/sglang/issues/913 https://github.com/triton-lang/triton/pull/4492 @Jokeren cc @ispobock @merrymercy ## env ``` Python: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] CUDA available: True GPU 0,1,2,3,4,5,6,7: NVIDIA H100 80GB HBM3 GPU 0,1,2,3,4,5,6,7...
latest main, A100 ```bash for ele in $(ls); do python3 -m pytest ${ele}; done ``` ``` =================================================================================== test session starts =================================================================================== platform linux -- Python 3.12.4, pytest-8.3.2, pluggy-1.5.0 rootdir: /flashinfer/python...
### Checklist - [ ] 1. If the issue you raised is not a feature but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed. -...
### Checklist - [X] 1. If the issue you raised is not a feature but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed. - [X]...
### Checklist - [X] 1. If the issue you raised is not a feature but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed. - [X]...
Hi bRPC developers @chenBright @wwbmmm @wasphin @thorneliu As titled, is bRPC compatible with the Stream feature of gRPC Python client? The main use case is similar to the SSE feature...
### Checklist - [ ] 1. I have searched related issues but cannot get the expected help. - [ ] 2. The bug has not been fixed in the latest...
## Triton Backend @ispobock @pankajroark - [x] [refactor triton backend 1](https://github.com/sgl-project/sglang/pull/3292), [2](https://github.com/sgl-project/sglang/pull/3309) - [x] [support custom mask](https://github.com/sgl-project/sglang/pull/3317) - [x] [support EAGLE 2](https://github.com/sgl-project/sglang/pull/3466) - [x] [compatible with CUDA Graph](https://github.com/sgl-project/sglang/pull/3500) - [x]...