Cheng Jinbao comments

Results: 7 comments of Cheng Jinbao

Absolute fastest inference speed

> I'm still not able to get deepspeed to work. I get this error
>
> ```
> [8/9] c++ -MMD -MF pt_binding.o.d -DTORCH_EXTENSION_NAME=transformer_inference -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/transformer/inference/includes -I/home/user/miniconda3/envs/tortoise/lib/python3.11/site-packages/deepspeed/ops/csrc/includes...
> ```

ModuleNotFoundError: No module named 'df'

Sorry, it seems I did not install the packages from requirements.txt correctly. Thanks a lot!

Question on quantized LLAMA3 versions for use with EAGLE

Any progress? I want to train a draft model with a quantized target model. Thanks!

Question on quantized LLAMA3 versions for use with EAGLE

> Hi, I'm just wondering if you both had any progress. Thank you in advance!
>
> [@UltramanKuz](https://github.com/UltramanKuz) [@jin-eld](https://github.com/jin-eld)

I also gave up on it.

[Performance]: vllm Eagle performance is worse than expected

> I observed 20% lower acceptance length numbers compared to the official EAGLE code using LLaMA3-Instruct 8B as base model and abhigoyal/EAGLE-LLaMA3-Instruct-8B-vllm as draft model. I noticed that the vLLM...

[Performance]: vllm Eagle performance is worse than expected

> I tried the following three approaches based on comments from this issue and [#11126 (comment)](https://github.com/vllm-project/vllm/issues/11126#issuecomment-2552720713), as well as by reviewing the implementation of the EAGLE framework. Thank you all...

[Performance]: vllm Eagle performance is worse than expected

> Hi, [@UltramanKuz](https://github.com/UltramanKuz) Modifications are applied by this PR. [#11672](https://github.com/vllm-project/vllm/pull/11672) I really appreciate your assistance!