mamba selective_scan

I'm on an M1 Mac running macOS with Python 3.10 and PyTorch 2.2.1. I wanted to use mamba_ssm.ops.selective_scan_interface natively, so I tried skipping the selective_scan_cuda import here. It turns out that this works, and model.to("mps") can be called as well, which is why I made this modification attempt.

Mar 24 '24 08:03 wang935415150

This is an interesting discovery. Just curious: is there a significant speedup from mps over cpu?

Mar 24 '24 11:03 radarFudan

You can put the import in a try/except, but I wouldn't call the selective_scan_ref function in selective_scan_fn if selective_scan_cuda is not found. Instead, it should raise an error. We don't want people to silently get much slower performance because they forgot to install the CUDA extension or the installation was not correct.
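As a rough sketch of that pattern (not the actual mamba_ssm code), the import can be guarded while the failure stays loud. The helper names try_import and require_cuda_extension are hypothetical and used only for illustration; only the module name selective_scan_cuda comes from this thread.

```python
import importlib


def try_import(module_name):
    """Import a module by name; return None instead of raising if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def require_cuda_extension(module_name="selective_scan_cuda"):
    """Hypothetical helper: return the compiled extension or fail loudly.

    There is deliberately no fallback to a slower reference implementation,
    so a missing or broken install cannot go unnoticed.
    """
    module = try_import(module_name)
    if module is None:
        raise ImportError(
            f"{module_name} could not be imported. Install the CUDA extension, "
            "or call the reference implementation explicitly if that is intended."
        )
    return module
```

With this shape, a user who wants the reference path on CPU or MPS has to opt in explicitly rather than getting it by accident.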

Mar 24 '24 18:03 tridao

This is an interesting discovery. Just curious: is there a significant speedup from mps over cpu?

Hello, my English is not very good, so I used a translation tool to write this reply. You can check out this official blog post; it does report some improvements. https://pytorch.org/blog/introducing-accelerated-pytorch-training-on-mac/

Mar 25 '24 05:03 wang935415150

I'm on an M1 Mac running macOS with Python 3.10 and PyTorch 2.2.1. I wanted to use mamba_ssm.ops.selective_scan_interface natively, so I tried skipping the selective_scan_cuda import here. It turns out that this works, and model.to("mps") can be called as well, which is why I made this modification attempt.

Can you give me a brief introduction to how it works on the MPS device? I would appreciate it if you could contact me.

Nov 17 '24 14:11 DowneyFlyfan
