Output 0 of MatMul8bitLtBackward is a view and is being modified inplace.
I would like to report a bug when using MatMul8bitLt.
Bug description
When I used the following three components in my code:
- `flash_attn_varlen_func` from flash_attn (v2.0.8, Github Link),
- `MatMul8bit` from bitsandbytes (v0.41.1),
- `Linear8bitLt` from PEFT (v0.3.0dev0, Github Link),
I came across the following error:
RuntimeError: Output 0 of MatMul8bitLtBackward is a view and is being modified inplace. This view was created inside a custom Function (or because an input was returned as-is) and the autograd logic to handle view+inplace would override the custom backward associated with the custom Function, leading to incorrect gradients. This behavior is forbidden. You can fix this by cloning the output of the custom Function.
How to fix it
The error can be fixed by modifying lines 440 and 441 of bitsandbytes/autograd/_functions.py:
clone_func = torch.clone if len(output_shape) == 3 else lambda x: x
return clone_func(output.view(output_shape))
to
if len(output_shape) == 3:
    return output.view(output_shape).clone()
else:
    return output
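For context on why the clone matters: PyTorch rejects in-place modification of a view returned from a custom autograd Function, while a cloned output is fine. A minimal sketch with a toy Function (not the actual bitsandbytes code):

```python
import torch

class ReturnsView(torch.autograd.Function):
    # Toy custom Function whose forward returns a view of an internal tensor,
    # mimicking MatMul8bitLt's `output.view(output_shape)`.
    @staticmethod
    def forward(ctx, x):
        return (x * 2).view(x.shape)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output * 2

x = torch.randn(4, 8, requires_grad=True)

out = ReturnsView.apply(x)
# out += 1.0  # RuntimeError: Output 0 of ReturnsViewBackward is a view and is being modified inplace. ...

out = ReturnsView.apply(x).clone()
out += 1.0    # fine: the clone is a regular tensor, not a view of the Function's output
out.sum().backward()
```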
Source of bug
This error happens because:
- `flash_attn_varlen_func` requires the q, k, v inputs of self-attention to be 2D, of shape (total_tokens, hidden_size);
- in case of 2D inputs, `len(output_shape)` is 2 and `torch.clone` is not used;
- in Linear8bitLt.forward, the result from `bnb.nn.Linear8bitLt` is modified in place: `result += output`;
- the autograd logic does not allow in-place modification of a view returned from a custom Function (see the sketch below).
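To make the interaction concrete, here is a minimal, self-contained repro sketch under those assumptions: a toy stand-in for MatMul8bitLt that reuses the same "only clone 3D outputs" logic, fed 2D inputs and followed by an in-place add like PEFT's `result += output`. None of this is the real library code.

```python
import torch

class ToyMatMul8bitLt(torch.autograd.Function):
    # Toy stand-in reproducing the clone_func logic from bnb v0.41.1 (lines 440-441).
    @staticmethod
    def forward(ctx, x, weight):
        ctx.save_for_backward(weight)
        output = x @ weight.t()
        output_shape = x.shape[:-1] + (weight.shape[0],)
        clone_func = torch.clone if len(output_shape) == 3 else lambda t: t
        return clone_func(output.view(output_shape))  # 2D path returns an un-cloned view

    @staticmethod
    def backward(ctx, grad_output):
        (weight,) = ctx.saved_tensors
        return grad_output @ weight, None

x = torch.randn(10, 16, requires_grad=True)  # 2D, like flash_attn's (total_tokens, hidden_size)
w = torch.randn(16, 16)
lora_out = torch.randn(10, 16)

result = ToyMatMul8bitLt.apply(x, w)
result += lora_out  # mirrors PEFT's `result += output`; raises the RuntimeError quoted above
```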
I got this error too when I tried to fine-tune an 8-bit model with PEFT.
me too!
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
same here
I get the same error.
Same error.
Will try to go down the rabbit hole to fix it
The same error occurred for me.
The same error occurred for me.
I provided a fix. Hope it helps.
Hi all,
Does this issue still occur with v0.45.0 and the latest PEFT? Happy to revisit it if so!