Output 0 of MatMul8bitLtBackward is a view and is being modified inplace.
I would like to report a bug when using MatMul8bitLt.
Bug description
When I used the following three components in my code:
- `flash_attn_varlen_func` from flash_attn (v2.0.8, Github Link),
- `MatMul8bit` from bitsandbytes (v0.41.1),
- `Linear8bitLt` from PEFT (v0.3.0dev0, Github Link),
I came across the following error:
RuntimeError: Output 0 of MatMul8bitLtBackward is a view and is being modified inplace. This view was created inside a custom Function (or because an input was returned as-is) and the autograd logic to handle view+inplace would override the custom backward associated with the custom Function, leading to incorrect gradients. This behavior is forbidden. You can fix this by cloning the output of the custom Function.
How to fix it
The error can be fixed by modifying lines 440 and 441 of bitsandbytes/autograd/_functions.py:
clone_func = torch.clone if len(output_shape) == 3 else lambda x: x
return clone_func(output.view(output_shape))
to
if len(output_shape) == 3:
    return output.view(output_shape).clone()
else:
    return output
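For context on why the clone matters: PyTorch rejects in-place modification of a view returned from a custom autograd Function, while a cloned output is fine. A minimal sketch with a toy Function (not the actual bitsandbytes code):

```python
import torch

class ReturnsView(torch.autograd.Function):
    # Toy custom Function whose forward returns a view of an internal tensor,
    # mimicking MatMul8bitLt's `output.view(output_shape)`.
    @staticmethod
    def forward(ctx, x):
        return (x * 2).view(x.shape)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output * 2

x = torch.randn(4, 8, requires_grad=True)

out = ReturnsView.apply(x)
# out += 1.0  # RuntimeError: Output 0 of ReturnsViewBackward is a view and is being modified inplace. ...

out = ReturnsView.apply(x).clone()
out += 1.0    # fine: the clone is a regular tensor, not a view of the Function's output
out.sum().backward()
```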
Source of bug
This error happens because:
- `flash_attn_varlen_func` requires the q, k, v inputs of self-attention to be 2D, of shape (total_tokens, hidden_size);
- in case of 2D inputs, `len(output_shape)` is 2 and `torch.clone` is not used;
- in Linear8bitLt.forward, the result from `bnb.nn.Linear8bitLt` is modified in place: `result += output`;
- the autograd logic does not allow in-place modification of a view returned from a custom Function (see the sketch below).
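To make the interaction concrete, here is a minimal, self-contained repro sketch under those assumptions: a toy stand-in for MatMul8bitLt that reuses the same "only clone 3D outputs" logic, fed 2D inputs and followed by an in-place add like PEFT's `result += output`. None of this is the real library code.

```python
import torch

class ToyMatMul8bitLt(torch.autograd.Function):
    # Toy stand-in reproducing the clone_func logic from bnb v0.41.1 (lines 440-441).
    @staticmethod
    def forward(ctx, x, weight):
        ctx.save_for_backward(weight)
        output = x @ weight.t()
        output_shape = x.shape[:-1] + (weight.shape[0],)
        clone_func = torch.clone if len(output_shape) == 3 else lambda t: t
        return clone_func(output.view(output_shape))  # 2D path returns an un-cloned view

    @staticmethod
    def backward(ctx, grad_output):
        (weight,) = ctx.saved_tensors
        return grad_output @ weight, None

x = torch.randn(10, 16, requires_grad=True)  # 2D, like flash_attn's (total_tokens, hidden_size)
w = torch.randn(16, 16)
lora_out = torch.randn(10, 16)

result = ToyMatMul8bitLt.apply(x, w)
result += lora_out  # mirrors PEFT's `result += output`; raises the RuntimeError quoted above
```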
I got this error too when I tried to fine-tune an 8-bit model with PEFT.
me too!
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
same here
I get the same error.
Same error.
Will try to go down the rabbit hole to fix it
The same error occurred for me.
The same error occurred for me.
I provided a fix. Hope it helps.
Hi all,
Does this issue still occur with v0.45.0 and the latest PEFT? Happy to revisit it if so!