ipex-llm
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Ma...
Can ipex-llm 0.43.1 run on CentOS 7.9? I encountered an error: `Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.` Error: Failed to load the...
We are trying to fine-tune ChatGLM3-6B using LoRA on Arc A770, with 1 card and with 2 cards, using the following command for 1 card:

```
python ./alpaca_lora_finetuning.py \
    --base_model "/home/intel/models/chatglm3-6b" \
    --data_path "yahma/alpaca-cleaned" \
    --lora_target_modules...
```
Running vLLM according to the instructions. Docker segfaults at startup, so I'm running directly on the machine. I'm starting the server with the following shell script. As you can see, I've tried to...
Log output:

```
(llm-cpp) D:\Users\Documents\Projects\llama-cpp>server.exe -m "Qwen1.5-MoE-A2.7B-Chat.Q4_K_M.gguf" -ngl 999
{"tid":"12472","timestamp":1719324322,"level":"INFO","function":"main","line":2943,"msg":"build info","build":1,"commit":"adbd0dc"}
{"tid":"12472","timestamp":1719324322,"level":"INFO","function":"main","line":2950,"msg":"system info","n_threads":11,"n_threads_batch":-1,"total_threads":22,"system_info":"AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI =...
```
**microsoft/Florence-2-large** model to run on an Arc 770 GPU for object detection on a sample image. [https://huggingface.co/microsoft/Florence-2-large](https://huggingface.co/microsoft/Florence-2-large)

```
(llm_vision) spandey2@imu-nex-nuc13x2-arc770-dut:~/LLM_Computer_Vision$ pip list | grep torch
intel-extension-for-pytorch  2.1.30+xpu
torch                        2.1.0.post2+cxx11.abi
torchaudio                   2.1.0.post2+cxx11.abi
torchvision...
```
Trying to do inference on an Arc GPU machine; I have followed these guidelines:

```
https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/Pipeline-Parallel-Inference
and run_mistral_arc_2_card.sh
```

```
(llm) :~/xxx/ipex-llm/python/llm/example/GPU/Pipeline-Parallel-Inference$ bash run_llama_arc_2_card.sh
:: WARNING: setvars.sh has already been run. Skipping...
```
I tried to use the benchmark tool to test this multimodal model and hit the error below. Model link: https://huggingface.co/openbmb/MiniCPM-V-2 Tool link: https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/dev/benchmark/all-in-one ![image](https://github.com/intel-analytics/ipex-llm/assets/33850226/ed06104d-859d-41a9-8e27-5dc6f790d6f1)
Hi team, I'd like to file this issue; hopefully you can investigate it from top to bottom. My environment is as below: Ubuntu 22.04.4 LTS on MTL with...
Every time I run the test, it loads the original model and converts it to lower bit. If we load a 34B model on 4 Arc cards, it will...
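The per-run reconversion can usually be avoided by converting once, saving the low-bit weights, and reloading them on later runs. A minimal sketch, assuming ipex-llm's `save_low_bit`/`load_low_bit` API on its `transformers` wrapper; the model path and cache directory are placeholders, not taken from this issue:

```python
# Sketch: cache the low-bit conversion so later runs skip it.
# Assumes ipex-llm is installed; paths below are hypothetical.
from ipex_llm.transformers import AutoModelForCausalLM

low_bit_dir = "./model-sym_int4"  # placeholder cache location

# First run: convert the original checkpoint once and save it.
model = AutoModelForCausalLM.from_pretrained(
    "/path/to/original-model",      # placeholder original checkpoint
    load_in_4bit=True,
    trust_remote_code=True,
)
model.save_low_bit(low_bit_dir)

# Later runs: load the already-converted weights directly,
# skipping the expensive load-and-convert step.
model = AutoModelForCausalLM.load_low_bit(low_bit_dir, trust_remote_code=True)
```

For a large model spread over several cards, this trades one-time disk space for a much faster startup on every subsequent run.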
![image](https://github.com/intel-analytics/ipex-llm/assets/78522263/c2a3da77-198c-4d5e-84fc-2e5915ee73f1) I wonder how I should install `bigdl-chronos[pytorch]` to avoid this error?
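One common pitfall with extras specifiers like `bigdl-chronos[pytorch]` (a guess here, since the screenshot contents aren't visible in this digest) is that shells such as `zsh` treat the square brackets as glob characters, so pip never sees the extra. Quoting the requirement sidesteps that:

```shell
# Quote the requirement so the shell does not expand the brackets.
pip install "bigdl-chronos[pytorch]"
```

If the error instead comes from pip itself, posting the full text of the message would help narrow it down.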