Tri Dao comments

Results: 446 comments of Tri Dao


triton.runtime.autotuner.OutOfResources: out of resource: shared memory, Required: 254208, Hardware limit: 101376.

Can you try reducing d_state (e.g.
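To put the numbers in the error message in perspective, here is a minimal sketch that parses the message and compares the two values. The helper name `parse_shared_memory_error` is hypothetical and not part of Triton or Mamba, and the values are presumably in bytes. The arithmetic only shows how far over the limit the kernel is; it does not say how much d_state has to shrink.

```python
import re

# Error text copied from the issue title above.
message = (
    "triton.runtime.autotuner.OutOfResources: out of resource: shared memory, "
    "Required: 254208, Hardware limit: 101376."
)


def parse_shared_memory_error(text):
    """Return (required, limit) parsed from the message (hypothetical helper)."""
    match = re.search(r"Required: (\d+), Hardware limit: (\d+)", text)
    if match is None:
        raise ValueError("not a shared-memory OutOfResources message")
    return int(match.group(1)), int(match.group(2))


required, limit = parse_shared_memory_error(message)
ratio = required / limit  # how many times over the hardware limit the request is
print(f"required={required}, limit={limit}, ratio={ratio:.2f}x")
```

The kernel asks for roughly two and a half times the shared memory the GPU offers, which is consistent with the suggestion to reduce d_state.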

Error when trying to use Mamba2

Thanks, I've just fixed it.

Error when trying to use Mamba2

It's compiling.

Error when trying to use Mamba2

Thanks, I've updated.

Error when trying to use Mamba2

Try making the dimension larger (e.g. multiple of 512).
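If you want to pick the next size that satisfies this, a small sketch of rounding a dimension up to a multiple of 512 might look like the following. The function name `round_up_to_multiple` is hypothetical, and which dimension to round depends on your model config.

```python
def round_up_to_multiple(value, multiple=512):
    """Return the smallest multiple of `multiple` that is >= value."""
    if value <= 0 or multiple <= 0:
        raise ValueError("value and multiple must be positive")
    # Integer ceiling division, then scale back up.
    return ((value + multiple - 1) // multiple) * multiple


# Example: a dimension of 768 would be bumped to 1024.
padded_dim = round_up_to_multiple(768)
print(padded_dim)
```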

Error when trying to use Mamba2

I don't know, I haven't tested anything that isn't a multiple of 512. You can also skip the causal_conv1d package (uninstalling it should mean the Mamba code uses torch nn.Conv1d).
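To check which path you would end up on, a minimal sketch is to test whether the package is importable in the current environment. `has_package` and `USE_FUSED_CONV` are hypothetical names for illustration, and the assumption here is that the fallback is chosen based on whether `causal_conv1d` can be imported.

```python
import importlib.util


def has_package(name):
    """Return True if a top-level package with this name can be imported."""
    return importlib.util.find_spec(name) is not None


# Assumption: if causal_conv1d is missing, the plain nn.Conv1d path is used.
USE_FUSED_CONV = has_package("causal_conv1d")
print("causal_conv1d installed:", USE_FUSED_CONV)
```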

The Reproducibility of Mamba

The current backward pass is not deterministic (it uses atomic adds).
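Why atomic adds lead to non-determinism: floating-point addition is not associative, and atomic adds from parallel threads can land in a different order on each run. The sketch below imitates that in plain Python by summing the same three values in two orders. It illustrates the effect only and is not the actual kernel code.

```python
def accumulate(values):
    """Add values one by one, like a sequence of atomic adds into one slot."""
    total = 0.0
    for v in values:
        total += v
    return total


# The same three contributions, arriving in two different orders.
order_a = [1e16, 1.0, -1e16]
order_b = [1e16, -1e16, 1.0]

result_a = accumulate(order_a)  # the 1.0 is lost when added to 1e16
result_b = accumulate(order_b)  # the large terms cancel first, so 1.0 survives
print(result_a, result_b)
```

In a real backward pass the differences are usually far smaller than this contrived case, but they are enough to make two runs not bitwise identical.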

The Reproducibility of Mamba

I think that's the PyTorch cumsum, so it's out of scope for this repo.

Error with Mamba2

Please update causal_conv1d.

Error when using FP16 or Mixed precision

Triton doesn't support V100 very well.