intel-npu-llm
A simple Python script for running LLMs on Intel's Neural Processing Units (NPUs)
Features
- Default models list:
  - meta-llama/Meta-Llama-3.1-8B-Instruct
  - microsoft/Phi-3-mini-4k-instruct
  - Qwen/Qwen2-7B
  - mistralai/Mistral-7B-Instruct-v0.2
  - openbmb/MiniCPM-1B-sft-bf16
  - TinyLlama/TinyLlama-1.1B-Chat-v1.0
- You can enter any model you like
  - There is no guarantee that every model will compile for the NPU, though
  - Here is a list of models likely to run on the NPU
- One-Time Setup: The script downloads the model, quantizes it, converts it to OpenVINO IR format, compiles it for the NPU, and caches the result for future runs (a sketch of this pipeline follows this list).
- Performance: Surprisingly fast inference speeds, even on devices with modest computational power (e.g., the 13 TOPS NPU in my Meteor Lake chip).
- Power Efficiency: While inference may be faster on the CPU or GPU on some devices, the NPU is significantly more energy-efficient, making it ideal for laptops.
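
For a sense of what that one-time setup involves under the hood, here is a minimal sketch using optimum-intel. The model ID, output path, and quantization settings are illustrative assumptions, not the script's exact configuration:

from optimum.intel import OVModelForCausalLM, OVWeightQuantizationConfig

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # any Hugging Face model ID

# INT4 symmetric, channel-wise weight compression -- a common choice for NPU
quant_config = OVWeightQuantizationConfig(bits=4, sym=True, group_size=-1)

# export=True downloads the model and converts it to OpenVINO IR,
# applying the weight compression along the way
model = OVModelForCausalLM.from_pretrained(
    model_id, export=True, quantization_config=quant_config
)
model.save_pretrained("models/tinyllama-int4")  # IR files saved on disk for reuse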
Screenshot

As you can see, the NPU is being used for text generation.
Requirements
- Python >= 3.13
- An Intel processor with an NPU:
  - Meteor Lake (Core Ultra Series 1, i.e., 1XX chips)
  - Arrow Lake (Core Ultra Series 2, i.e., 2XX chips)
  - Lunar Lake (Core Ultra Series 2, i.e., 2XXV chips)
- The latest Intel NPU driver
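
If you are unsure whether the driver is set up correctly, you can check that OpenVINO sees the NPU with a couple of lines using the core runtime API:

import openvino as ov

core = ov.Core()
print(core.available_devices)  # e.g., ['CPU', 'GPU', 'NPU']

if "NPU" in core.available_devices:
    # Full device name, e.g., 'Intel(R) AI Boost'
    print(core.get_property("NPU", "FULL_DEVICE_NAME"))
else:
    print("NPU not detected; check your driver installation.")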
Installation
Step 1: Clone the Repository
git clone https://github.com/justADeni/intel-npu-llm.git
cd intel-npu-llm
Step 2: Create a Virtual Environment
python -m venv npu_venv
Step 3: Activate the Virtual Environment
- On Windows:
npu_venv\Scripts\activate
- On Linux:
source npu_venv/bin/activate
Step 4: Install Dependencies
pip install -r requirements.txt
Step 5: Run the Script
python intel_npu_llm.py
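
To get a feel for what the script does at inference time, here is a minimal sketch using the openvino-genai package. The model path, cache directory, and generation settings are assumptions for illustration, not the script's actual code:

import openvino_genai as ov_genai

# Point at a directory containing an exported OpenVINO IR model
# (see the export sketch under Features). CACHE_DIR stores the
# compiled NPU blob so later startups skip recompilation.
pipe = ov_genai.LLMPipeline(
    "models/tinyllama-int4", "NPU", CACHE_DIR=".npucache"
)

print(pipe.generate("What is an NPU?", max_new_tokens=100))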
Notes
- Resource-Intensive Compilation: The data-aware quantization and compilation steps can take up to tens of minutes, depending on your hardware. However, they are performed only once per model, and the result is cached for future runs.
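
For context, data-aware weight compression calibrates the quantization on sample text rather than on the weights alone, which is what makes this step slow. With optimum-intel it can be configured roughly like this; the AWQ setting and calibration dataset are illustrative assumptions:

from optimum.intel import OVModelForCausalLM, OVWeightQuantizationConfig

# Data-aware INT4 compression: 'dataset' triggers calibration on real
# text, and AWQ rescales weights to reduce quantization error.
quant_config = OVWeightQuantizationConfig(
    bits=4,
    sym=True,
    group_size=-1,
    quant_method="awq",
    dataset="wikitext2",  # built-in calibration set name in optimum-intel
)

model = OVModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    export=True,
    quantization_config=quant_config,
)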
Contributing
Contributions, bug reports, and feature requests are welcome! Feel free to open an issue or submit a pull request.
License
This project is licensed under the MIT License.
Enjoy using intel-npu-llm! For any questions or feedback, please reach out or open an issue on GitHub.