Dongjie Shi
We refactored the vLLM code today; the new code and image are still going through the CI/CD pipeline, but the docs have been updated first. Please try `python -m ipex_llm.vllm.entrypoints.openai.api_server` with today's ipex-llm...
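For reference, a possible launch invocation is sketched below. Note that `--model` and `--port` are standard upstream vLLM OpenAI-server flags, while `--device` and `--load-in-low-bit` are ipex-llm-specific options as we understand them; since today's refactor is still going through CI/CD, please double-check the exact arguments against the updated doc.

```bash
# Sketch of a launch command; --model/--port follow upstream vLLM, while
# --device/--load-in-low-bit are assumed ipex-llm-specific flags.
python -m ipex_llm.vllm.entrypoints.openai.api_server \
  --model /path/to/your/model \
  --port 8000 \
  --device xpu \
  --load-in-low-bit sym_int4
```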
Could you please share the client script/code you use to send requests to the vLLM API server, so that we can reproduce the same inference/text-generation TPS you are seeing?
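In case it helps, here is a minimal sketch of the kind of client we mean, assuming the server is reachable at http://localhost:8000 with the standard OpenAI-compatible /v1/completions route; the model name is a placeholder for whatever name your server registered.

```python
# Minimal end-to-end TPS measurement against an OpenAI-compatible
# /v1/completions endpoint. Host/port and model name are placeholders.
import time
import requests

URL = "http://localhost:8000/v1/completions"
payload = {
    "model": "YOUR_SERVED_MODEL_NAME",  # placeholder: use your served model name
    "prompt": "Explain the difference between prefill and decode in one paragraph.",
    "max_tokens": 256,
    "temperature": 0,
}

start = time.time()
resp = requests.post(URL, json=payload, timeout=600)
elapsed = time.time() - start
resp.raise_for_status()

# vLLM's OpenAI-compatible responses report token counts under "usage".
completion_tokens = resp.json()["usage"]["completion_tokens"]
print(f"{completion_tokens} tokens in {elapsed:.2f}s "
      f"-> {completion_tokens / elapsed:.2f} tokens/s (end-to-end)")
```

Note that this measures end-to-end throughput including prefill; to isolate pure generation TPS you would stream the response and start timing from the first returned token.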
> Is that testing the inference performance or the text generation performance? I can't seem to get the vllm_online_benchmark working; it just keeps throwing 404 errors on the server as far...
Hi @HumerousGorgon, what is the difference between the inference and the text generation performance you mentioned here? Can you observe data like the output below? Or are there any other metrics that help you differentiate between...
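If the 404s persist, one quick sanity check (a sketch, assuming the default host/port) is to query the server's model list and confirm that the route and model name your client sends match what the server actually registered, since a mismatch in either is a common cause of 404 responses:

```python
# Sanity check for 404s: confirm the server is reachable and list the
# model names it serves. Host/port are assumed defaults; adjust as needed.
import requests

resp = requests.get("http://localhost:8000/v1/models", timeout=30)
print(resp.status_code)
print(resp.json())  # the "id" fields are the model names clients must send
```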
Please also make sure "Resizable BAR support" and "Above 4G MMIO" are enabled in the BIOS.
> Is there a schedule for when we could get a vLLM-supported version to run MiniCPM-V?

Hi, the upgrade of IPEX-LLM vLLM to 0.5.x is in progress; we will...
The upgrade of IPEX-LLM vLLM to 0.5.4 is finished, and MiniCPM-V-2.6 is now supported; please refer to https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/vLLM-Serving#image-input
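For reference, below is a sketch of an image request against the OpenAI-compatible chat endpoint; the host/port, served model name, and image path are all placeholders, and the linked README remains the authoritative example.

```python
# Sketch: send a base64-encoded image to /v1/chat/completions.
# Host/port, model name, and image path are placeholders.
import base64
import requests

with open("example.jpg", "rb") as f:  # placeholder image file
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "model": "MiniCPM-V-2_6",  # placeholder: use your served model name
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
    "max_tokens": 128,
}

resp = requests.post("http://localhost:8000/v1/chat/completions",
                     json=payload, timeout=600)
print(resp.json()["choices"][0]["message"]["content"])
```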
Sorry, we don't have any plans to support ComfyUI and Flux yet. BTW, phi-3-vision is supported; please refer to https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HuggingFace/Multimodal/phi-3-vision
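As a rough sketch of the loading pattern that example follows (the model ID and keyword arguments reflect the general ipex-llm low-bit GPU workflow; the linked example is authoritative for the full image-processing and generation code):

```python
# Sketch of ipex-llm low-bit loading for phi-3-vision on an Intel GPU.
# The model ID is assumed; see the linked example for the full pipeline.
from transformers import AutoProcessor
from ipex_llm.transformers import AutoModelForCausalLM

model_id = "microsoft/Phi-3-vision-128k-instruct"  # assumed HF model ID
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_4bit=True,       # ipex-llm low-bit weight quantization
    trust_remote_code=True,
)
model = model.half().to("xpu")  # run on the Intel GPU

processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
# Build inputs with the processor (text + image) and call model.generate(...)
# as shown in the linked example.
```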