Gary Wang
Hello ipex-llm experts, I'm hitting an issue with Llama-3-8B on the MTL-H iGPU and would appreciate any advice. :) It seems to be an issue with the iGPU in the MTL 155H but no...
Hello, I'm running the chatglm3-6b LLM with langchain-chatchat on the iGPU of my MTL 155H, and it is failing with an error. ``` 2024-06-07 13:50:11,037 - utils.py[line:38] - ERROR: peer closed connection...
**Describe the bug** Observed a very slow time to first token (the first token took about 154.4526 s) when running AI inference on the iGPU/Xe3 of the PTL484 platform **How to reproduce** Run IPEX-LLM's sample...
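When reporting time-to-first-token regressions like the one above, it helps to attach the measurement method. Below is a minimal, hypothetical sketch of how TTFT can be timed from a streaming token iterator; the `fake_stream` generator is a stand-in for the actual IPEX-LLM streaming generate call, which is not shown in the issue:

```python
import time

def measure_ttft(stream):
    """Return (first_token, seconds_until_first_token) for a token iterator.

    Hypothetical helper: `stream` stands in for whatever streaming
    generation API the model exposes.
    """
    start = time.perf_counter()
    first = next(iter(stream))  # block until the first token arrives
    return first, time.perf_counter() - start

def fake_stream():
    # Simulated model latency before the first token (placeholder for
    # the real inference call on the iGPU).
    time.sleep(0.05)
    yield "Hello"
    yield " world"

token, ttft = measure_ttft(fake_stream())
print(f"first token {token!r} after {ttft:.3f} s")
```

Measuring at the point where the first token is yielded (rather than when the full response returns) isolates prefill cost, which is what the 154 s figure above describes.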
Observed a 10~15% performance drop with Ollama+IPEX-LLM compared to plain llama.cpp. We understand the Ollama framework adds some overhead of its own, but we would still appreciate it if upstream could clarify whether we are misunderstanding something. -...