Tri Dao
Can you try reducing d_state (e.g.
Thanks, I've just fixed it.
It's compiling.
Thanks, I've updated.
Try making the dimension larger (e.g. a multiple of 512).
Idk, I haven't tested anything that isn't a multiple of 512. You can also skip the conv1d package (uninstalling it should mean the mamba code uses torch nn.Conv1d).
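For the multiple-of-512 suggestion, here is a minimal sketch of rounding a model dimension up to the next multiple. The helper name `round_up_to_multiple` is hypothetical and not part of the mamba code; it only illustrates the arithmetic.

```python
def round_up_to_multiple(dim: int, multiple: int = 512) -> int:
    """Return the smallest multiple of `multiple` that is >= dim.

    Hypothetical helper for illustration; not an API of this repo.
    """
    if dim <= 0 or multiple <= 0:
        raise ValueError("dim and multiple must be positive")
    # Integer ceiling division, then scale back up.
    return ((dim + multiple - 1) // multiple) * multiple


print(round_up_to_multiple(768))   # 1024
print(round_up_to_multiple(1024))  # 1024
```

Whether a padded dimension actually avoids the issue is untested here; this only picks a size in the range that has been tried.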
The current backward pass is not deterministic (it uses atomic adds).
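To see why atomic adds make the result non-deterministic: atomic additions can complete in a different order from run to run, and floating-point addition is not associative, so the accumulated sum depends on that order. The sketch below is plain Python, not the actual kernel; the `accumulate` function and the values are made up purely for illustration.

```python
def accumulate(values, order):
    """Sum `values` in the given index order, as an accumulator would."""
    total = 0.0
    for i in order:
        total += values[i]
    return total


# Illustrative values: large terms that cancel, plus small terms.
values = [1e16, 1.0, -1e16, 1.0]

# Same numbers, two different accumulation orders.
in_order = accumulate(values, [0, 1, 2, 3])
reordered = accumulate(values, [0, 2, 1, 3])

print(in_order)   # 1.0 (the first 1.0 is lost when added to 1e16)
print(reordered)  # 2.0 (the large terms cancel before the small ones are added)
```

In a real backward pass the differences are usually much smaller than this, but they are enough to make gradients differ slightly between runs.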
I think that's the pytorch cumsum, so it's out of the scope of this repo.
Please update causal_conv1d.
Triton doesn't support V100 very well.