
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Ma...

Results 608 ipex-llm issues

## Description

### 1. Why the change?
### 2. User API changes
### 3. Summary of the change
### 4. How to test?
- [ ] N/A
- [ ]...

As the title says, using starcoder2 times out or appears stuck. Logs attached. [starcoder2_timout.txt](https://github.com/user-attachments/files/15588642/starcoder2_timout.txt)

user issue

Hi, I have successfully used the code below to test the speed of token generation with qwen-7b under ipex.

```python
def main(model_dir="Qwen/Qwen-7B-Chat"):
    seed = 1024
    max_experiment_times = 1...
```

user issue
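For reference, such a token-speed test can be sketched in a framework-agnostic way. In this minimal sketch, `generate_fn` is a hypothetical callable standing in for a real `model.generate(...)` call; it is not an ipex-llm or transformers API.

```python
import time


def measure_tokens_per_second(generate_fn, prompt, n_runs=3):
    """Time a text-generation callable and report average tokens/sec.

    `generate_fn` is a placeholder: it takes a prompt and returns the
    number of new tokens it produced, standing in for a model call.
    """
    rates = []
    for _ in range(n_runs):
        start = time.perf_counter()
        n_tokens = generate_fn(prompt)
        elapsed = time.perf_counter() - start
        rates.append(n_tokens / elapsed)
    # Average over runs to smooth out warm-up and scheduling jitter.
    return sum(rates) / len(rates)
```

In practice the first run is usually discarded as warm-up (kernel compilation, cache population) before averaging.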

Using the ipex-llm Docker version for inference, but it hits errors from the util files at inference time. Below is the log:

```
------------------------------------------------------------------------
Inferencing ./samples/customer_sku_transformation.txt ...
------------------------------------------------------------------------
The installed version of...
```

user issue

platform: Intel(R) Xeon(R) Gold 6150 CPU @ 2.70GHz
os: Suse 13
model: mistralai/Mistral-7B-Instruct-v0.2
ipex-llm: 2.1.0b20240515
transformers: 4.37.0
ldd: 2.22
gcc/g++: 11.1.0

After "Loading checkpoint shards" reaches 100%, it shows: `Error: Failed to load the...`

user issue

version: 2.1.0b20240610
error: `ipex_llm/transformers/models/chatglm4.py", line 342, in core_attn_forward` raises `NameError: name 'math' is not defined`

user issue
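For context, this failure mode is easy to reproduce: Python resolves global names at call time, so a module that uses `math` without importing it imports cleanly and only fails when the function actually runs. A minimal sketch (the function is illustrative, not the actual `core_attn_forward` code):

```python
# Without `import math`, the module loads fine, but calling the
# function raises NameError -- the name is looked up at call time.
def broken(head_dim):
    return 1.0 / math.sqrt(head_dim)

try:
    broken(64)
except NameError as e:
    print(e)  # name 'math' is not defined

# The fix is a one-line import at module level:
import math

def fixed(head_dim):
    return 1.0 / math.sqrt(head_dim)

print(fixed(64))  # 0.125
```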

When I run the `generate.py` script, I get the following error:

```bash
python ./generate.py --repo-id-or-model-path 'google/codegemma-7b-it' --prompt 'Write a hello world program in Python' --n-predict 32
Traceback (most...
```

user issue

![微信图片_20240605135354](https://github.com/intel-analytics/ipex-llm/assets/166265863/4bcfc12a-ead8-468a-ab24-dfe60fb1d9d4) The following error occurred after running for some time; please see the attached screenshot. No way to reproduce it has been found so far. Using GPU-accelerated Ollama to run Qwen1.5...

user issue

Llamaindex-ts example on CPU and Intel GPU * Agent * RAG

Qwen1.5-7B runs out of memory (OOM) with 8K input. After editing qwen1.5\Lib\site-packages\transformers\models\qwen2\modeling_qwen2.py to comment out `logits = logits.float()`, it runs and memory usage drops substantially. Does this change affect the model in other ways? Can the overall memory consumption of this model be optimized?

user issue
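The OOM above is consistent with that cast: `logits.float()` upcasts the full logits tensor from half to single precision, and at 8K context the tensor spans the whole vocabulary for every position. A back-of-envelope sketch (the Qwen1.5 vocabulary size of 151936 is an assumption taken from the public model config):

```python
def logits_bytes(seq_len, vocab_size, bytes_per_element):
    # Logits shape is (batch=1, seq_len, vocab_size); total bytes
    # scale linearly with element width.
    return seq_len * vocab_size * bytes_per_element

VOCAB = 151936  # Qwen1.5 vocabulary size (assumed from config)
SEQ = 8192      # 8K input

fp16 = logits_bytes(SEQ, VOCAB, 2)
fp32 = logits_bytes(SEQ, VOCAB, 4)
print(f"fp16 logits: {fp16 / 2**30:.2f} GiB")  # 2.32 GiB
print(f"fp32 logits: {fp32 / 2**30:.2f} GiB")  # 4.64 GiB
```

So the float32 upcast alone costs roughly an extra 2.3 GiB for a single 8K prompt. Skipping the cast keeps the logits in half precision, which may slightly change sampling numerics; for greedy decoding during pure inference the effect is usually small, but that is a judgment call, not a guarantee.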