DefTruth

Results 256 comments of DefTruth

Runtime error when running MLA models with "prefix caching enabled" and "chunked prefill disabled"

> [@DefTruth](https://github.com/DefTruth) can you try? [#14255](https://github.com/vllm-project/vllm/pull/14255) just opened a PR based on [@ZhongYingMatrix](https://github.com/ZhongYingMatrix)'s diff

Of course!

> > [@DefTruth](https://github.com/DefTruth) can you try? [#14255](https://github.com/vllm-project/vllm/pull/14255) just opened a PR based on [@ZhongYingMatrix](https://github.com/ZhongYingMatrix)'s diff
>
> Of course!

It works.

Can we use "EP/TP MoE + DP Attention" on V0?

> > can we use "EP/TP MoE + DP Attention" on V0?
>
> No, DP is only added in V1.

Got it~

> If using Lightning LoRA and cache-dit in Qwen-Image-Edit-2509 with nunchaku, can it accelerate?

Sure, it is supported. Please check the example for more details: https://github.com/vipshop/cache-dit/blob/main/examples/quantize/run_qwen_image_edit_plus_lightning_nunchaku.py

@lmxyy Hi~ can you take a look at this PR?

> > > I'm in the chat group. The author has been pretty busy lately~ P.S. Does using your PR require recompilation?

No, it does not require recompilation.

> I don't think any reinstallation or recompilation of nunchaku is required. It should be supported out of the box, haven't tested though.
>
> Just install the dependency library https://pypi.org/project/cache-dit/...

> How to enable this in ComfyUI?

We don't have workflows for now.