
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Ma...

Results 608 ipex-llm issues

https://github.com/THUDM/GLM-4 https://huggingface.co/THUDM/glm-4-9b

user issue

## Description Add qwen2 support for the Pipeline-Parallel-FastAPI example.

## Description ### 1. Why the change? #11167 ### 2. User API changes ### 3. Summary of the change ### 4. How to test? - [ ] N/A - [...

With batch size 1 and 1024-512 (input-output) lengths, it hung as below: THE MYSTERY OF THE CITY](9781441125608_epub_itb-ch5.xhtml) The man's journey took him to the heart of the city, where he discovered a hidden underground...

user issue

I'm running llama3 inference on an MTL Core Ultra 7 1003H iGPU on Ubuntu 22.04. I followed this link https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/llama3 and used generate.py. The complete script is: source /opt/intel/oneapi/setvars.sh...
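For context, a typical invocation of that llama3 GPU example looks like the sketch below. The model path, prompt, and token count are placeholders, and the `--repo-id-or-model-path`, `--prompt`, and `--n-predict` flags are assumed from the example's README, so treat the exact flags as assumptions rather than the reporter's actual command:

```shell
# Hypothetical run sketch for the ipex-llm llama3 GPU example (flags assumed).
# 1) Load the oneAPI environment so the SYCL/Level Zero runtimes are found.
source /opt/intel/oneapi/setvars.sh

# 2) Run the example's generate.py against a local or Hub Llama-3 checkpoint.
#    All values below are illustrative placeholders, not from the issue report.
python ./generate.py \
  --repo-id-or-model-path meta-llama/Meta-Llama-3-8B-Instruct \
  --prompt "What is AI?" \
  --n-predict 32
```

Running this requires an Intel GPU plus the ipex-llm and oneAPI stacks installed, which is why driver/oneAPI version mismatches (as in the issue below) commonly surface here.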

user issue

Team, we are currently using Ubuntu Server 22.04 with kernel 5.15. Can you tell us which oneAPI version and GPU driver version work with the latest IPEX framework? Thanks!

user issue

## Description Add Pipeline Parallel FastAPI Example QuickStart.