DefTruth comments

Results: 256 comments of DefTruth

[Bug]: AssertionError, assert prefill_metadata.context_chunk_seq_tot is not None

Runtime error when running MLA models with "prefix caching enabled" and "chunked prefill disabled"

[Bug]: AssertionError, assert prefill_metadata.context_chunk_seq_tot is not None

> [@DefTruth](https://github.com/DefTruth) can you try? [#14255](https://github.com/vllm-project/vllm/pull/14255) just opened a PR based on [@ZhongYingMatrix](https://github.com/ZhongYingMatrix)'s diff

Of course!

[Bug]: AssertionError, assert prefill_metadata.context_chunk_seq_tot is not None

> > [@DefTruth](https://github.com/DefTruth) can you try? [#14255](https://github.com/vllm-project/vllm/pull/14255) just opened a PR based on [@ZhongYingMatrix](https://github.com/ZhongYingMatrix)'s diff
>
> Of course!

It works.

[V1] EP + DP Attention

Can we use "EP/TP MoE + DP Attention" on V0?

[V1] EP + DP Attention

> > Can we use "EP/TP MoE + DP Attention" on V0?
>
> No, DP is only added in V1.

Got it ~

feat: introduce cache-dit to nunchaku

> If using lightning lora and cachedit in qwen image edit - 2509 with nunchaku, can it accelerate?

Sure, it is supported. Please check the example for more details: https://github.com/vipshop/cache-dit/blob/main/examples/quantize/run_qwen_image_edit_plus_lightning_nunchaku.py