AttributeError: 'NoneType' object has no attribute 'device'

Open · JoeStrout opened this issue 2 years ago · 10 comments

I attempted to use Simple LLaMA FineTuner via Colab for the first time today. Training seems to work, but when I try to generate a response (after selecting the just-trained model), I just see “Error” in the UI. Over on the Colab tab, I see the following output.

```
local/lib/python3.9/dist-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: /usr/lib64-nvidia did not contain libcudart.so as expected! Searching further paths...
  warn(msg)
/usr/local/lib/python3.9/dist-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/sys/fs/cgroup/memory.events /var/colab/cgroup/jupyter-children/memory.events')}
  warn(msg)
/usr/local/lib/python3.9/dist-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('//colab.research.google.com/tun/m/cc48301118ce562b961b3c22d803539adc1e0c19/gpu-t4-s-16865b17sf5c1 --tunnel_background_save_delay=10s --tunnel_periodic_background_save_frequency=30m0s --enable_output_coalescing=true --output_coalescing_required=true'), PosixPath('--listen_host=172.28.0.12 --target_host=172.28.0.12 --tunnel_background_save_url=https')}
  warn(msg)
/usr/local/lib/python3.9/dist-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/env/python')}
  warn(msg)
/usr/local/lib/python3.9/dist-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('module'), PosixPath('//ipykernel.pylab.backend_inline')}
  warn(msg)
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 7.5
CUDA SETUP: Detected CUDA version 118
CUDA SETUP: Loading binary /usr/local/lib/python3.9/dist-packages/bitsandbytes/libbitsandbytes_cuda118.so...
Running on local URL:  http://127.0.0.1:7860/
Running on public URL: https://541be216ae78b16326.gradio.live/

This share link expires in 72 hours. For free permanent hosting and GPU upgrades (NEW!), check out Spaces: https://huggingface.co/spaces
Loading base model...
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'LLaMATokenizer'. 
The class this function is called from is 'LlamaTokenizer'.
Number of samples: 2
Training...
2023-03-31 02:12:15.711305: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/usr/local/lib/python3.9/dist-packages/transformers/optimization.py:391: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
  warnings.warn(
{'train_runtime': 8.0524, 'train_samples_per_second': 0.248, 'train_steps_per_second': 0.248, 'train_loss': 3.214326858520508, 'epoch': 1.0}
Loading base model...
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'LLaMATokenizer'. 
The class this function is called from is 'LlamaTokenizer'.
Number of samples: 2
Training...
{'train_runtime': 4.9098, 'train_samples_per_second': 0.407, 'train_steps_per_second': 0.407, 'train_loss': 3.2097392082214355, 'epoch': 1.0}
Loading base model...
Loading peft model lora-elderberry-fig...
Loading tokenizer...
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'LLaMATokenizer'. 
The class this function is called from is 'LlamaTokenizer'.
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/gradio/routes.py", line 393, in run_predict
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 1108, in process_api
    result = await self.call_function(
  File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 915, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.9/dist-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.9/dist-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.9/dist-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.9/dist-packages/gradio/helpers.py", line 588, in tracked_fn
    response = fn(*args)
  File "/content/simple-llama-finetuner/main.py", line 104, in generate_text
    output = model.generate(  # type: ignore
  File "/usr/local/lib/python3.9/dist-packages/peft/peft_model.py", line 581, in generate
    outputs = self.base_model.generate(**kwargs)
  File "/usr/local/lib/python3.9/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/transformers/generation/utils.py", line 1485, in generate
    return self.sample(
  File "/usr/local/lib/python3.9/dist-packages/transformers/generation/utils.py", line 2524, in sample
    outputs = self(
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/transformers/models/llama/modeling_llama.py", line 689, in forward
    outputs = self.model(
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/transformers/models/llama/modeling_llama.py", line 577, in forward
    layer_outputs = decoder_layer(
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/transformers/models/llama/modeling_llama.py", line 292, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/transformers/models/llama/modeling_llama.py", line 196, in forward
    query_states = self.q_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
  File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/peft/tuners/lora.py", line 565, in forward
    result = super().forward(x)
  File "/usr/local/lib/python3.9/dist-packages/bitsandbytes/nn/modules.py", line 242, in forward
    out = bnb.matmul(x, self.weight, bias=self.bias, state=self.state)
  File "/usr/local/lib/python3.9/dist-packages/bitsandbytes/autograd/_functions.py", line 488, in matmul
    return MatMul8bitLt.apply(A, B, out, bias, state)
  File "/usr/local/lib/python3.9/dist-packages/bitsandbytes/autograd/_functions.py", line 317, in forward
    state.CxB, state.SB = F.transform(state.CB, to_order=formatB)
  File "/usr/local/lib/python3.9/dist-packages/bitsandbytes/functional.py", line 1698, in transform
    prev_device = pre_call(A.device)
AttributeError: 'NoneType' object has no attribute 'device'
```

JoeStrout (Mar 31 '23 02:03)

I just tried to do it via Spaces (HuggingFace), on GPU-enabled hardware, using the limerick dataset. I get the exact same error, a traceback ending in:

```
python3.8/site-packages/bitsandbytes/autograd/_functions.py", line 317, in forward
    state.CxB, state.SB = F.transform(state.CB, to_order=formatB)
  File "/home/user/.pyenv/versions/3.8.9/lib/python3.8/site-packages/bitsandbytes/functional.py", line 1698, in transform
    prev_device = pre_call(A.device)
AttributeError: 'NoneType' object has no attribute 'device'
```

JoeStrout (Mar 31 '23 06:03)

hi @JoeStrout Can you add `device_map={"":0}` when calling `PeftModel.from_pretrained`?
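
For example, roughly (an untested sketch; the base checkpoint name below is just a placeholder for whatever Simple LLaMA FineTuner actually loads):

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

# Untested sketch -- the base checkpoint name is a placeholder.
# device_map={"": 0} places every weight on GPU 0, so no 8-bit layer is left
# offloaded on CPU with an uninitialized quantization state.
base_model = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    load_in_8bit=True,
    torch_dtype=torch.float16,
    device_map={"": 0},
)
model = PeftModel.from_pretrained(
    base_model,
    "lora-elderberry-fig",  # the adapter trained in the log above
    device_map={"": 0},
)
tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-7b-hf")
```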

younesbelkada (Mar 31 '23 07:03)

hi @JoeStrout Can you add `device_map={"":0}` when calling `PeftModel.from_pretrained`?

I have encountered the same problem, my version is peft=0.2.0. I wonder if you have resolved this issue?

YSLLYW (May 03 '23 12:05)

Offload the weight to CPU before wrapping it in Int8Params, and then load the Int8Params to the GPU.
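
Something like this untested sketch (toy shapes, a random tensor standing in for the real weights), just to show the order of operations:

```python
import torch
import bitsandbytes as bnb

# Untested sketch with toy shapes: keep the fp16 weight on CPU, wrap it in
# Int8Params, and only then move the layer to the GPU -- the .cuda() call is
# what quantizes the weight and records its int8 data and scales (CB/SCB).
layer = bnb.nn.Linear8bitLt(4096, 4096, bias=False, has_fp16_weights=False)
cpu_weight = torch.randn(4096, 4096, dtype=torch.float16)  # stand-in for real weights
layer.weight = bnb.nn.Int8Params(
    cpu_weight, requires_grad=False, has_fp16_weights=False
)
layer = layer.cuda()  # quantization happens here, on the way to the GPU
```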

Tracin (May 30 '23 06:05)

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

github-actions[bot] (Dec 20 '23 16:12)

No, this is still an open issue. I subscribed to it recently...

lmmx (Dec 26 '23 14:12)

Also getting this error, are there any updates?

huylenguyen (Mar 05 '24 09:03)

Also getting this error, are there any updates?

lwh8915 (Mar 28 '24 04:03)

Not from me. I gave up on this long ago.

JoeStrout (Apr 01 '24 03:04)