
CPU fine-tuning doesn't work

Open sanbuphy opened this issue 8 months ago • 4 comments

Hi, I installed ipex-llm following the docs: https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/QLoRA-FineTuning

and I hit the following error:

found intel-openmp in /root/miniconda3/envs/llm/lib/libiomp5.so
The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
found tcmalloc in /root/miniconda3/envs/llm/lib/python3.11/site-packages/ipex_llm/libs/libtcmalloc.so
+++++ Env Variables +++++
Internal:
    ENABLE_IOMP     = 1
    ENABLE_GPU      = 0
    ENABLE_JEMALLOC = 0
    ENABLE_TCMALLOC = 1
    LIB_DIR    = /root/miniconda3/envs/llm/lib
    BIN_DIR    = /root/miniconda3/envs/llm/bin
    LLM_DIR    = /root/miniconda3/envs/llm/lib/python3.11/site-packages/ipex_llm

Exported:
    LD_PRELOAD             = /root/miniconda3/envs/llm/lib/libiomp5.so /root/miniconda3/envs/llm/lib/python3.11/site-packages/ipex_llm/libs/libtcmalloc.so
    OMP_NUM_THREADS        = 12
    MALLOC_CONF            = 
    USE_XETLA              = 
    ENABLE_SDP_FUSION      = 
    SYCL_CACHE_PERSISTENT  = 
    BIGDL_LLM_XMX_DISABLED = 
    SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS = 
+++++++++++++++++++++++++
Complete.



2024-06-04 20:20:46,549 - WARNING - The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
2024-06-04 20:20:46,975 - INFO - PyTorch version 2.1.2+cpu available.
/root/miniconda3/envs/llm/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Map: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5176/5176 [00:02<00:00, 2348.96 examples/s]
/root/miniconda3/envs/llm/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 10.25it/s]
2024-06-04 20:20:55,720 - INFO - Converting the current model to sym_int4 format......
/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/optimization.py:429: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
  warnings.warn(
  0%|                                                                                                                                                             | 0/200 [00:00<?, ?it/s]Traceback (most recent call last):
  File "/hy-tmp/ipex-llm/python/llm/example/CPU/QLoRA-FineTuning/qlora_finetuning_cpu.py", line 120, in <module>
    result = trainer.train()
             ^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/trainer.py", line 1624, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/trainer.py", line 1961, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/trainer.py", line 2902, in training_step
    loss = self.compute_loss(model, inputs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/trainer.py", line 2925, in compute_loss
    outputs = model(**inputs)
              ^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/accelerate/utils/operations.py", line 817, in forward
    return model_forward(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/accelerate/utils/operations.py", line 805, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/amp/autocast_mode.py", line 16, in decorate_autocast
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/peft/peft_model.py", line 1129, in forward
    return self.base_model(
           ^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/peft/tuners/tuners_utils.py", line 161, in forward
    return self.model.forward(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 1173, in forward
    outputs = self.model(
              ^^^^^^^^^^^
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 1058, in forward
    layer_outputs = decoder_layer(
                    ^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 773, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
                                                          ^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/llm/lib/python3.11/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 698, in forward
    attn_output = torch.nn.functional.scaled_dot_product_attention(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Expected attn_mask dtype to be bool or to match query dtype, but got attn_mask.dtype: float and  query.dtype: c10::BFloat16 instead.
  0%|          | 0/200 [00:00<?, ?it/s]            
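For reference, I can trip the same check outside the example with a tiny standalone snippet (shapes are made up, this is just my own repro of the SDPA dtype check on PyTorch 2.1.2, not code from the example script):

import torch
import torch.nn.functional as F

# bf16 query/key/value, like what autocast produces during training
q = torch.randn(1, 8, 16, 64, dtype=torch.bfloat16)
k = torch.randn(1, 8, 16, 64, dtype=torch.bfloat16)
v = torch.randn(1, 8, 16, 64, dtype=torch.bfloat16)

# float32 additive mask -> dtype differs from the query
mask = torch.zeros(1, 1, 16, 16, dtype=torch.float32)

# On torch 2.1.2 this raises:
#   RuntimeError: Expected attn_mask dtype to be bool or to match query dtype ...
out = F.scaled_dot_product_attention(q, k, v, attn_mask=mask)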

But when I update PyTorch to the latest version (2.3.0), it works. I don't know what's happening here. Could you take a look? Thanks!
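If it helps narrow things down: in that same standalone snippet, casting the mask to the query dtype before the call also avoids the error on 2.1.2 (just an observation about the SDPA check, not a suggestion for where ipex-llm should fix it):

# mask dtype now matches query dtype, so the 2.1.2 check passes
out = F.scaled_dot_product_attention(q, k, v, attn_mask=mask.to(q.dtype))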

sanbuphy · Jun 04 '24 14:06