mamba
Hello, I would like to perform single-step inference using Mamba, which means my inference task only needs to generate one token (or extract the last token embedding, without needing intermediate...
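A minimal sketch of one way to do this with the mamba_ssm package, assuming a single Mamba block; the d_model, sequence length, and layer_idx values are illustrative. Prefilling the prompt once caches the recurrent state, so any later step only needs a length-1 input:

```python
import torch
from mamba_ssm import Mamba
from mamba_ssm.utils.generation import InferenceParams

device = "cuda"
# layer_idx is required so the block can cache its state in inference_params.
block = Mamba(d_model=256, d_state=16, d_conv=4, expand=2, layer_idx=0).to(device)

x = torch.randn(1, 128, 256, device=device)  # (batch, seqlen, d_model) prompt

# Prefill: run the whole prompt once; conv and SSM states are cached.
params = InferenceParams(max_seqlen=129, max_batch_size=1)
y = block(x, inference_params=params)
last_embedding = y[:, -1]  # last-token embedding, enough for one-token tasks

# Optional single decode step: reuses the cached state; input must be length 1.
params.seqlen_offset += x.shape[1]
y_step = block(torch.randn(1, 1, 256, device=device), inference_params=params)
```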
Please, please, consider adding an ssm_state input parameter to selective_scan_fn to allow hidden-state initialisation for the Mamba block. Also, please consider making the hidden state differentiable, as currently in selective_scan_fn...
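For reference, a minimal differentiable pure-PyTorch sketch of what is being requested: a selective scan that accepts an initial ssm_state and returns the final state with gradients attached. Shapes follow selective_scan_ref (u, delta: (B, D, L); A: (D, N); B, C: (B, N, L)); the function name selective_scan_with_state is hypothetical, not part of mamba_ssm:

```python
import torch

def selective_scan_with_state(u, delta, A, B, C, D=None, ssm_state=None):
    batch, dim, L = u.shape
    N = A.shape[1]
    # Start from the provided state instead of zeros (the requested feature).
    x = ssm_state if ssm_state is not None else u.new_zeros(batch, dim, N)
    deltaA = torch.exp(delta.unsqueeze(-1) * A)  # (B, D, L, N)
    deltaB_u = (delta.unsqueeze(-1)
                * B.transpose(1, 2).unsqueeze(1)
                * u.unsqueeze(-1))               # (B, D, L, N)
    ys = []
    for t in range(L):
        x = deltaA[:, :, t] * x + deltaB_u[:, :, t]  # plain ops keep autograd graph
        ys.append(torch.einsum("bdn,bn->bd", x, C[:, :, t]))
    y = torch.stack(ys, dim=-1)                  # (B, D, L)
    if D is not None:
        y = y + u * D.unsqueeze(-1)              # skip connection
    return y, x  # final state is differentiable w.r.t. inputs and ssm_state
```

Because the recurrence is written with ordinary tensor ops, autograd flows through both the outputs and the returned state, at the cost of a Python-level loop over L.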
Hello, thanks for sharing your great work! I ran into some problems while trying to understand the source code in 'selective_scan_interface.py'. I wonder what the difference is between 'SelectiveScanFn' and 'MambaInnerFn'?...
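Roughly: SelectiveScanFn implements only the SSM recurrence, while MambaInnerFn fuses the whole inner Mamba block (input projection, causal conv, parameter projections, scan, gating, output projection) into one autograd function so intermediates can be recomputed in backward instead of stored. A hedged sketch of the un-fused path that MambaInnerFn replaces, with names mirroring the reference code in mamba_ssm; treat it as illustrative, not the actual kernel:

```python
import torch
import torch.nn.functional as F
from einops import rearrange
from mamba_ssm.ops.selective_scan_interface import selective_scan_fn

def mamba_inner_unfused(hidden, in_proj, conv1d, x_proj, dt_proj, A, D, out_proj):
    # in_proj produces both the SSM input x and the gate z.
    xz = rearrange(in_proj(hidden), "b l d -> b d l")
    x, z = xz.chunk(2, dim=1)
    # Depthwise causal conv + SiLU (the fast path fuses this via causal_conv1d).
    x = F.silu(conv1d(x)[..., : x.shape[-1]])
    # Data-dependent SSM parameters delta, B, C.
    x_dbl = x_proj(rearrange(x, "b d l -> (b l) d"))
    dt, B, C = torch.split(
        x_dbl, [dt_proj.in_features, A.shape[1], A.shape[1]], dim=-1
    )
    dt = rearrange(dt_proj.weight @ dt.t(), "d (b l) -> b d l", l=x.shape[-1])
    B = rearrange(B, "(b l) n -> b n l", l=x.shape[-1])
    C = rearrange(C, "(b l) n -> b n l", l=x.shape[-1])
    # SelectiveScanFn covers just this call, with z folded in as the gate.
    y = selective_scan_fn(x, dt, A, B, C, D, z=z,
                          delta_bias=dt_proj.bias.float(), delta_softplus=True)
    return out_proj(rearrange(y, "b d l -> b l d"))
```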
First of all, this is excellent work. I want to do simple classification on the MNIST dataset; I have tried many things but could not get the model to compile. How should I...
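A minimal sketch of one way to wire this up, assuming the mamba_ssm package and a CUDA GPU; treating each 28-pixel image row as one sequence token is an arbitrary choice, and all sizes are illustrative:

```python
import torch
import torch.nn as nn
from mamba_ssm import Mamba

class MambaMNIST(nn.Module):
    def __init__(self, d_model=64, n_classes=10):
        super().__init__()
        self.embed = nn.Linear(28, d_model)   # each image row -> one token
        self.mamba = Mamba(d_model=d_model, d_state=16, d_conv=4, expand=2)
        self.norm = nn.LayerNorm(d_model)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, images):                  # images: (batch, 1, 28, 28)
        seq = images.squeeze(1)                 # (batch, 28, 28): 28 tokens of width 28
        h = self.mamba(self.embed(seq))
        return self.head(self.norm(h[:, -1]))   # classify from the last token

model = MambaMNIST().cuda()
logits = model(torch.randn(8, 1, 28, 28, device="cuda"))  # (8, 10)
```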
Thanks for your work! I wonder if this code will run on Windows?
Hi, I'm running the mamba test_selective_scan.py benchmark with an increased model dimension, and the tests start to fail. Here is how I increase the dimension: ``` diff --git a/tests/ops/test_selective_scan.py b/tests/ops/test_selective_scan.py...
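For context, a self-contained sketch of the comparison such a test performs, assuming selective_scan_fn and selective_scan_ref from mamba_ssm; the dim value and the idea that tolerances may need to scale with it are assumptions here. Larger dim means longer reductions and more accumulated rounding error, so fixed tolerances that pass at small dim can start to fail:

```python
import torch
from mamba_ssm.ops.selective_scan_interface import selective_scan_fn, selective_scan_ref

torch.manual_seed(0)
batch, dim, seqlen, d_state = 2, 4096, 128, 16
device, dtype = "cuda", torch.float32

u = torch.randn(batch, dim, seqlen, device=device, dtype=dtype)
delta = 0.5 * torch.rand(batch, dim, seqlen, device=device, dtype=dtype)
A = -0.5 - 0.5 * torch.rand(dim, d_state, device=device, dtype=dtype)  # stable (negative)
B = torch.randn(batch, d_state, seqlen, device=device, dtype=dtype)
C = torch.randn(batch, d_state, seqlen, device=device, dtype=dtype)
D = torch.randn(dim, device=device, dtype=dtype)

out = selective_scan_fn(u, delta, A, B, C, D)       # fused CUDA kernel
out_ref = selective_scan_ref(u, delta, A, B, C, D)  # pure-PyTorch reference
print((out - out_ref).abs().max())  # compare against a tolerance scaled for dim
```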
I have tried to replace the self-attention layer with Mamba and Hyena, but observed worse performance for Mamba. I am not sure whether it's because of my misconfiguration...
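A minimal sketch of the kind of swap being described, keeping the usual pre-norm residual wiring; the module names here are illustrative (mamba_ssm also ships its own Block wrapper that handles norm and residual):

```python
import torch
import torch.nn as nn
from mamba_ssm import Mamba

class MambaLayer(nn.Module):
    """Transformer-style layer with the attention sub-block swapped for Mamba."""
    def __init__(self, d_model):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.mixer = Mamba(d_model=d_model)   # replaces nn.MultiheadAttention

    def forward(self, x):                     # x: (batch, seqlen, d_model)
        return x + self.mixer(self.norm(x))   # pre-norm residual, as with attention
```

One thing worth checking in such comparisons is that hyperparameters (learning rate, warmup, weight decay) tuned for attention are re-tuned for the Mamba variant, since they rarely transfer unchanged.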
I'm using macOS on an M1 chip, natively, with Python 3.10 and PyTorch 2.2.1, and I tried to use mamba_ssm.ops.selective_scan_interface. I tried to skip the CUDA part here; the truth is that it works, and...
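A sketch of the fallback this describes: on platforms without the CUDA kernels (e.g. Apple Silicon), the pure-PyTorch selective_scan_ref can stand in for the fused kernel, assuming the module is importable once the selective_scan_cuda import is skipped as the poster describes. Shapes are illustrative, and this path is much slower than the CUDA one:

```python
import torch
from mamba_ssm.ops.selective_scan_interface import selective_scan_ref

batch, dim, seqlen, d_state = 1, 32, 64, 16
u = torch.randn(batch, dim, seqlen)
delta = torch.rand(batch, dim, seqlen)
A = -torch.rand(dim, d_state)            # negative A keeps the scan stable
B = torch.randn(batch, d_state, seqlen)
C = torch.randn(batch, d_state, seqlen)

y = selective_scan_ref(u, delta, A, B, C, delta_softplus=True)  # runs on CPU
```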
Dear Authors! Thank you for your great work! I have a question about adapting from cross-attention to cross-Mamba. Can I modify Mamba from this to this (with...
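There is no official cross-Mamba in mamba_ssm, but one simple, hedged way to approximate cross-attention with a plain Mamba block is to prepend the conditioning sequence so the causal scan can carry its information into the query positions, then keep only the query outputs. This illustrates the idea in the question, not a drop-in equivalent of cross-attention:

```python
import torch
from mamba_ssm import Mamba

d_model = 128
block = Mamba(d_model=d_model).cuda()

context = torch.randn(1, 77, d_model, device="cuda")   # e.g. text features
queries = torch.randn(1, 196, d_model, device="cuda")  # e.g. image tokens

fused = torch.cat([context, queries], dim=1)           # context first: the scan is causal
out = block(fused)[:, context.shape[1]:]               # keep only the query positions
```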