zxy issues

Results: 7 issues of zxy

CUDA error: an illegal memory access was encountered

Thank you for your excellent work! Currently, I am trying to reproduce KVQuant but have encountered some errors. Your assistance with this matter would be appreciated. ### 1. Reproduce the...

Question on H2O experiment reproduction

Thanks for your excellent work! As stated in the paper Table 1: "Performance comparison of SnapKV and H2O across various LLMs on LongBench", could you provide the scripts/codes for reproducing...

[Feature] metrics support

## Objective
Align with [vLLM v1 metrics system](https://docs.vllm.ai/en/latest/design/v1/metrics.html) and beyond. We also refer to [SGLang monitoring](https://github.com/sgl-project/sglang/blob/1ab14c4c5c67d0577451764f4a77d685a7dc2db4/examples/monitoring/README.md).

## TODO
- [x] Change `time.perf_counter()`
- [ ] Abstract output processing outside of...

WIP

add deepseekv3 doc

Just as the title goes.

WIP

documentation

Fix ep deployment issues

[POC] Encoder Disaggregation

quant blocked fp8