DarkSharpness
Hi. At this moment, we don't have plans to support diffusion models. Diffusion workloads differ substantially from language-model serving and introduce significantly more complexity, so it's non-trivial...
Thanks. Actually, when `page_size > 1`, the page-index allocation logic is quite different. Page indices must be (de)allocated at the granularity of `page_size`. It's much trickier and does not...
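To illustrate the page-granular bookkeeping (a minimal sketch with a hypothetical `PageAllocator` class, not mini-sglang's actual allocator): with `page_size > 1`, free memory is tracked in whole pages, and a request for `n` tokens must round up to `ceil(n / page_size)` pages, so a partially filled page still consumes a full page.

```python
class PageAllocator:
    """Hypothetical page-granular KV-cache allocator (illustration only)."""

    def __init__(self, num_pages: int, page_size: int):
        self.page_size = page_size
        self.free_pages = list(range(num_pages))  # indices of free pages

    def alloc(self, num_tokens: int) -> list[int]:
        # Round up: a partially used page still occupies a whole page.
        need = -(-num_tokens // self.page_size)
        if need > len(self.free_pages):
            raise MemoryError("out of KV-cache pages")
        pages, self.free_pages = self.free_pages[:need], self.free_pages[need:]
        return pages

    def free(self, pages: list[int]) -> None:
        # Deallocation is page-granular as well.
        self.free_pages.extend(pages)

allocator = PageAllocator(num_pages=8, page_size=16)
pages = allocator.alloc(33)  # 33 tokens -> 3 pages of 16 tokens each
print(len(pages))            # 3
```

With `page_size == 1` this degenerates to per-token allocation, which is why the single-token path is so much simpler.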
@DhiraPT Yes, for future support of MLA models. Popular attention implementations like `FlashMLA` and `trtllm_mla_decode` (from flashinfer) require a fixed page size of 64 or 128, so we need this...
LGTM. Will get it merged after we implement MLA models.
Thanks. Personally, I think this is too heavy for mini-sglang. In addition, I'm not sure whether it would bring a concrete performance gain in a real-world setting. Since we already have...