CPU fine-tuning doesn't work
Hi, I installed ipex-llm following the docs: https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/QLoRA-FineTuning
and I get the following error:
found intel-openmp in /root/miniconda3/envs/llm/lib/libiomp5.so
The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
found tcmalloc in /root/miniconda3/envs/llm/lib/python3.11/site-packages/ipex_llm/libs/libtcmalloc.so
+++++ Env Variables +++++
Internal:
ENABLE_IOMP = 1
ENABLE_GPU = 0
ENABLE_JEMALLOC = 0
ENABLE_TCMALLOC = 1
LIB_DIR = /root/miniconda3/envs/llm/lib
BIN_DIR = /root/miniconda3/envs/llm/bin
LLM_DIR = /root/miniconda3/envs/llm/lib/python3.11/site-packages/ipex_llm
Exported:
LD_PRELOAD = /root/miniconda3/envs/llm/lib/libiomp5.so /root/miniconda3/envs/llm/lib/python3.11/site-packages/ipex_llm/libs/libtcmalloc.so
OMP_NUM_THREADS = 12
MALLOC_CONF =
USE_XETLA =
ENABLE_SDP_FUSION =
SYCL_CACHE_PERSISTENT =
BIGDL_LLM_XMX_DISABLED =
SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS =
+++++++++++++++++++++++++
Complete.
2024-06-04 20:20:46,549 - WARNING - The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
2024-06-04 20:20:46,975 - INFO - PyTorch version 2.1.2+cpu available.
/root/miniconda3/envs/llm/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Map: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5176/5176 [00:02<00:00, 2348.96 examples/s]
/root/miniconda3/envs/llm/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 10.25it/s]
2024-06-04 20:20:55,720 - INFO - Converting the current model to sym_int4 format......
/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/optimization.py:429: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
warnings.warn(
0%| | 0/200 [00:00<?, ?it/s]Traceback (most recent call last):
File "/hy-tmp/ipex-llm/python/llm/example/CPU/QLoRA-FineTuning/qlora_finetuning_cpu.py", line 120, in <module>
result = trainer.train()
^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/trainer.py", line 1624, in train
return inner_training_loop(
^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/trainer.py", line 1961, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/trainer.py", line 2902, in training_step
loss = self.compute_loss(model, inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/trainer.py", line 2925, in compute_loss
outputs = model(**inputs)
^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/accelerate/utils/operations.py", line 817, in forward
return model_forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/accelerate/utils/operations.py", line 805, in __call__
return convert_to_fp32(self.model_forward(*args, **kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/amp/autocast_mode.py", line 16, in decorate_autocast
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/peft/peft_model.py", line 1129, in forward
return self.base_model(
^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/peft/tuners/tuners_utils.py", line 161, in forward
return self.model.forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 1173, in forward
outputs = self.model(
^^^^^^^^^^^
File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 1058, in forward
layer_outputs = decoder_layer(
^^^^^^^^^^^^^^
File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 773, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 698, in forward
attn_output = torch.nn.functional.scaled_dot_product_attention(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Expected attn_mask dtype to be bool or to match query dtype, but got attn_mask.dtype: float and query.dtype: c10::BFloat16 instead.
0%| | 0/200 [00:00<?, ?it/s]
However, when I update PyTorch to the latest version (2.3.0) it works. I don't know what is happening. Could you take a look? Thanks!
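For reference, here is a minimal sketch (not taken from the example script, just an illustration) of what the error message means: torch.nn.functional.scaled_dot_product_attention expects attn_mask to be bool or to have the same dtype as the query, and the behavior appears to differ between PyTorch versions, which may be why upgrading to 2.3.0 makes the error go away. Casting the mask to the query dtype is one possible workaround, assuming the mask is an additive float mask.

import torch
import torch.nn.functional as F

# Query/key/value in bfloat16, as in the traceback above.
q = torch.randn(1, 2, 4, 8, dtype=torch.bfloat16)
k = torch.randn(1, 2, 4, 8, dtype=torch.bfloat16)
v = torch.randn(1, 2, 4, 8, dtype=torch.bfloat16)

# Float32 additive mask: mismatched dtype vs. the bf16 query.
mask = torch.zeros(1, 1, 4, 4, dtype=torch.float32)

try:
    # On PyTorch 2.1.x (CPU) this is expected to raise the same RuntimeError
    # about attn_mask dtype; newer versions may accept it.
    F.scaled_dot_product_attention(q, k, v, attn_mask=mask)
except RuntimeError as e:
    print("Reproduced:", e)

# Possible workaround: cast the mask to the query dtype before calling SDPA.
out = F.scaled_dot_product_attention(q, k, v, attn_mask=mask.to(q.dtype))
print(out.dtype)  # torch.bfloat16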