
Make alibi slopes a trainable parameter?

Open • penguinshin opened this issue 3 weeks ago • 1 comment

I'm trying to pass a trainable tensor as the `alibi_slopes` argument so that I can have trainable relative position biases. However, when I do this, the gradients for the slopes are still zero. Is there a way to enable trainable slopes?

penguinshin, Oct 31 '25 03:10

No, that's not implemented. One would have to change the backward pass code to compute the gradient with respect to the slopes.

tridao, Oct 31 '25 13:10
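
For anyone wondering what "compute the gradient of the slopes" would involve, here is a minimal pure-Python sketch of the math for a single head. It is not flash-attention code and does not touch its kernels. It assumes a non-causal bias of the form `-slope * |i - j|` with equal query and key lengths, and the function names `alibi_attention` and `slope_grad` are made up for illustration. The analytic gradient is checked against a finite difference.

```python
import math

def alibi_attention(q, k, v, slope):
    """Single-head attention with an assumed ALiBi bias of -slope * |i - j|.

    Returns (out, probs), where probs holds the softmax rows.
    """
    scale = 1.0 / math.sqrt(len(q[0]))
    out, probs = [], []
    for i in range(len(q)):
        scores = [
            scale * sum(a * b for a, b in zip(q[i], k[j])) - slope * abs(i - j)
            for j in range(len(k))
        ]
        m = max(scores)  # subtract the row max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        p = [e / z for e in exps]
        probs.append(p)
        out.append([sum(p[j] * v[j][c] for j in range(len(v)))
                    for c in range(len(v[0]))])
    return out, probs

def slope_grad(probs, v, d_out):
    """Analytic dL/dslope, given the upstream gradient d_out = dL/d(out)."""
    grad = 0.0
    for i, p in enumerate(probs):
        # dL/dP[i][j] = d_out[i] . v[j]
        d_p = [sum(g * x for g, x in zip(d_out[i], v[j])) for j in range(len(v))]
        dot = sum(pj * dpj for pj, dpj in zip(p, d_p))
        for j, (pj, dpj) in enumerate(zip(p, d_p)):
            d_s = pj * (dpj - dot)       # softmax backward: dL/dS[i][j]
            grad += d_s * (-abs(i - j))  # dS[i][j]/dslope = -|i - j|
    return grad

# Small made-up inputs: 3 positions, head dimension 2.
q = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]]
k = [[0.3, 0.1], [-0.2, 0.5], [0.2, -0.4]]
v = [[1.0, 0.0], [0.0, 2.0], [-1.0, 0.5]]
d_out = [[1.0, -1.0], [0.5, 2.0], [-0.3, 0.7]]
slope = 0.25

def loss(s):
    """Scalar loss whose gradient with respect to out is exactly d_out."""
    out, _ = alibi_attention(q, k, v, s)
    return sum(g * o for g_row, o_row in zip(d_out, out)
               for g, o in zip(g_row, o_row))

_, probs = alibi_attention(q, k, v, slope)
analytic = slope_grad(probs, v, d_out)
eps = 1e-6
numeric = (loss(slope + eps) - loss(slope - eps)) / (2 * eps)
print(analytic, numeric)
```

The point is that the slope gradient is a reduction of the score gradient `dS` weighted by `-|i - j|`, summed over all query and key positions of a head. `dS` is already an intermediate of the attention backward pass, so in principle only that extra reduction is missing, though wiring it into a fused kernel is a separate matter.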