Yihua Cheng
Good question. I thought vLLM uses a docker image for testing (though their unit tests take forever to run). Maybe another solution is to pre-install all the needed non-python...
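For the pre-install idea, a minimal sketch of what that base image could look like — the base image tag and package list here are illustrative assumptions, not what vLLM or LMCache actually use:

```
# Sketch only: bake non-Python dependencies into a reusable base image
# so CI test runs don't reinstall them every time. Base image tag and
# packages below are assumptions for illustration.
FROM nvidia/cuda:12.4.1-devel-ubuntu22.04
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential cmake git \
    && rm -rf /var/lib/apt/lists/*
# CI jobs then layer the Python test dependencies on top of this image.
```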
This looks like a bug. We will try to fix it soon.
@YaoJiayi I found something hard-coded here: https://github.com/LMCache/LMCache/blob/e9cb5189a82329877402c11c14728e4c0df1afa1/lmcache/integration/vllm/vllm_v1_adapter.py#L403 Is this the root cause?
@cotol7 NIXL depends on UCX, and running `pip install nixl` by itself does not install UCX. You can use the NIXL or Dynamo docker image, or the cuda-dl-base docker image, or compile...
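If you go the compile route, a sketch under assumptions (standard UCX source-build steps; the install prefix and library-path export are arbitrary choices for illustration):

```
# Build UCX from source first, make its libraries discoverable,
# then install NIXL on top of it.
git clone https://github.com/openucx/ucx.git
cd ucx
./autogen.sh
./contrib/configure-release --prefix=/usr/local/ucx
make -j"$(nproc)" && make install
export LD_LIBRARY_PATH=/usr/local/ucx/lib:$LD_LIBRARY_PATH
pip install nixl
```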
We are also working on providing a Dockerfile that includes a correct NIXL installation. It should land in #578.
Thanks @IRONICBo , looking forward to your contribution!
I'm traveling these days. Will come back to this PR after this Wednesday.
Fixed the crash problem. Now lm_eval runs with the following output on the llama-3.1-8B model:

```
|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.7933|±  |0.0234|
```
...
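For reference, a representative invocation (a sketch — the exact flags used for this run are not shown above) using lm-evaluation-harness with the vLLM backend and 5-shot gsm8k, matching the table:

```
lm_eval --model vllm \
  --model_args pretrained=meta-llama/Llama-3.1-8B \
  --tasks gsm8k \
  --num_fewshot 5
```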
@njhill @robertgshaw2-redhat Now the crashing & hanging issue should be fixed.