annotated-mamba
associative ssm op order
I might be wrong, but in the colab/blog, I think the $\oplus$ op used for the associative scan in the selective state space model should have $a_2 a_1$ (rather than $a_1 a_2$) as the first component of its output, reflecting the fact that the leftmost $A$ transform gets applied first.
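To spell out the reasoning, assuming the recurrence is written $h_t = a_t h_{t-1} + b_t$ (my notation, which may not match the post exactly), unrolling two steps gives

$$
h_2 = a_2 h_1 + b_2 = a_2 (a_1 h_0 + b_1) + b_2 = (a_2 a_1)\, h_0 + (a_2 b_1 + b_2),
$$

so I would expect

$$
(a_1, b_1) \oplus (a_2, b_2) = (a_2 a_1,\; a_2 b_1 + b_2).
$$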
(Since the Mamba matrices are diagonal and therefore commute, I guess it doesn't actually matter here; I just found this initially confusing in the presentation.)
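Here is a small sanity check of that claim, as a sketch rather than anything taken from the repo. It assumes the recurrence $h_t = a_t h_{t-1} + b_t$ and uses plain 2x2 nested lists; all function names (`combine`, `combine_reversed`, `step`, and so on) are made up for illustration. With non-commuting matrices only the $a_2 a_1$ ordering reproduces the sequential recurrence, while with diagonal matrices both orderings agree.

```python
# Illustrative sketch: 2x2 matrices as nested lists, hypothetical helper names.

def matmul(x, y):
    """2x2 matrix product x @ y."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec(x, v):
    """2x2 matrix times length-2 vector."""
    return [sum(x[i][k] * v[k] for k in range(2)) for i in range(2)]

def vadd(u, v):
    """Elementwise vector sum."""
    return [p + q for p, q in zip(u, v)]

def combine(left, right):
    """(a1, b1) (+) (a2, b2) = (a2 a1, a2 b1 + b2); the left step is applied first."""
    a1, b1 = left
    a2, b2 = right
    return matmul(a2, a1), vadd(matvec(a2, b1), b2)

def combine_reversed(left, right):
    """Same second component, but first component a1 a2 (the ordering in question)."""
    a1, b1 = left
    a2, b2 = right
    return matmul(a1, a2), vadd(matvec(a2, b1), b2)

def step(elem, h):
    """Apply one affine step h -> a h + b."""
    a, b = elem
    return vadd(matvec(a, h), b)

def two_steps(e1, e2, h0):
    """Sequential recurrence: apply e1, then e2."""
    return step(e2, step(e1, h0))

h0 = [1, 2]

# Non-commuting matrices: only the a2 a1 ordering matches the recurrence.
e1 = ([[1, 1], [0, 1]], [1, 0])
e2 = ([[1, 0], [1, 1]], [0, 1])
print(step(combine(e1, e2), h0) == two_steps(e1, e2, h0))
print(step(combine_reversed(e1, e2), h0) == two_steps(e1, e2, h0))

# Diagonal matrices (the Mamba case): both orderings match.
d1 = ([[2, 0], [0, 3]], [1, 0])
d2 = ([[5, 0], [0, 7]], [0, 1])
print(step(combine(d1, d2), h0) == two_steps(d1, d2, h0))
print(step(combine_reversed(d1, d2), h0) == two_steps(d1, d2, h0))
```

So the ordering is invisible in the diagonal case, which is presumably why nothing breaks in practice.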
It looks correct in the Triton `first_order_op`, but I think it is reversed in the reference PyTorch op and in the LaTeX above it.
Thanks for this terrific writeup! It really clarified some things for me.