                        NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
Getting the below error when training the Llama 3 8B 4-bit model in Unsloth:
==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 54 | Num Epochs = 10
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 4
\        /    Total batch size = 8 | Total steps = 60
 "-____-"     Number of trainable parameters = 41,943,040
NotImplementedError                       Traceback (most recent call last)
35 frames
/usr/local/lib/python3.10/dist-packages/xformers/ops/fmha/dispatch.py in _run_priority_list(name, priority_list, inp)
     63     for op, not_supported in zip(priority_list, not_supported_reasons):
     64         msg += "\n" + _format_not_supported_reasons(op, not_supported)
---> 65     raise NotImplementedError(msg)
     66
     67
NotImplementedError: No operator found for memory_efficient_attention_forward with inputs:
query       : shape=(2, 50, 8, 4, 128) (torch.bfloat16)
key         : shape=(2, 50, 8, 4, 128) (torch.bfloat16)
value       : shape=(2, 50, 8, 4, 128) (torch.bfloat16)
attn_bias   : <class 'xformers.ops.fmha.attn_bias.LowerTriangularMask'>
p           : 0.0
flshattF is not supported because:
xFormers wasn't build with CUDA support
operator wasn't built - see python -m xformers.info for more info
cutlassF is not supported because:
xFormers wasn't build with CUDA support
operator wasn't built - see python -m xformers.info for more info
smallkF is not supported because:
max(query.shape[-1] != value.shape[-1]) > 32
xFormers wasn't build with CUDA support
dtype=torch.bfloat16 (supported: {torch.float32})
attn_bias type is <class 'xformers.ops.fmha.attn_bias.LowerTriangularMask'>
operator wasn't built - see python -m xformers.info for more info
operator does not support BMGHK format
unsupported embed per head: 128
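The error output points at python -m xformers.info; a quick way to inspect the installed build from a notebook cell (a sketch, using only standard torch/xFormers attributes):

# Sketch: "operator wasn't built" usually means the xformers wheel does not
# match the installed torch/CUDA pair, so print both side by side.
import torch, xformers
print(torch.__version__, torch.version.cuda)
print(xformers.__version__)
!python -m xformers.info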
Using the latest notebook and code:

%%capture
# Installs Unsloth, Xformers (Flash Attention) and all other packages!
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps "xformers<0.0.26" trl peft accelerate bitsandbytes
It was working fine yesterday, but the same code is not working today. Please let me know how to solve this.
The fix mentioned in #400 is not working.
Indeed. This worked 2 days ago, but now I'm getting this error. The fix in #400 has already been applied.
Are you all on Colab? Try the new method maybe:
%%capture
# Installs Unsloth, Xformers (Flash Attention) and all other packages!
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps xformers trl peft accelerate bitsandbytes
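After reinstalling, a quick smoke test (a sketch: the shapes are arbitrary, and bfloat16 needs an Ampere-or-newer GPU):

# Sketch: call xFormers' memory-efficient attention once; if the wheel was
# built with CUDA support this runs instead of raising NotImplementedError.
import torch
from xformers.ops import memory_efficient_attention
q = k = v = torch.randn(1, 16, 8, 64, device="cuda", dtype=torch.bfloat16)
print(memory_efficient_attention(q, k, v).shape)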
Indeed, I copied the notebook and it still had the "xformers<0.0.26" dependency.
Facing the same issue (NotImplementedError: No operator found for memory_efficient_attention_forward); I am in a SageMaker notebook.
Used the method below:
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps xformers trl peft accelerate bitsandbytes
@skerit Wait so does the new one I posted work? @subhamiitk @acsankar There is another way which might work:
First check if torch.__version__ is 2.2 or lower. If yes, do:
!pip install -U "xformers<0.0.26" --index-url https://download.pytorch.org/whl/cu121
!pip install "unsloth[kaggle-new] @ git+https://github.com/unslothai/unsloth.git"
If 2.3 or higher, do:
!pip install -U xformers --index-url https://download.pytorch.org/whl/cu121
!pip install "unsloth[kaggle-new] @ git+https://github.com/unslothai/unsloth.git"
@danielhanchen Yes, just changing the install command from !pip install --no-deps "xformers<0.0.26" trl peft accelerate bitsandbytes to !pip install --no-deps xformers trl peft accelerate bitsandbytes did the trick for me in Google Colab.
@danielhanchen - Thanks for the support, it's working now after the change.
Working fine after the recommended changes.
Great! :)