mamba
Does anyone know how to solve this problem?
I'm curious; I've trained a 20M Mamba model for molecular generation, and it seems to fare quite badly when trained on small datasets. I added a dropout layer since it...
Thank you for your outstanding work! I'm curious if you've thought about including an additional parameter in the mamba_split_conv1d_scan_combined function to accept an initial_conv_state. This could open up some intriguing...
I did not do bidirectional processing inside Mamba2 (the same as Vision Mamba); I handled the bidirectional part outside of the class, but the whole model has a large number...
mamba-ssm: 2.0.4. The loss is not NaN, but `MambaSplitConv1dScanCombinedFnBackward` returned nan values in its 0th output. Thanks.
Hello @tridao, first of all, congratulations on the great job you did with Mamba 2. Could you please explain the purpose and operation of the **_chunk_state_fwd** function and its...
Hello, is it possible to directly load pretrained weights from Mamba1 into Mamba2?
`import selective_scan_cuda` fails with `ImportError: libcudart.so.11.0: cannot open shared object file: No such file or directory`. My CUDA version is 12.1. Is CUDA 12.x incompatible with Mamba2?
Thank you for your great work. I found that in Mamba2, chunk_size defaults to 256, while my sequence length is only 200 and it still runs normally. In issue #439, you...
Hello, it appears there may be a typo in line 168 of `mamba2_simple.py`, specifically in the part: `self.conv1d(xBC.transpose(1, 2)).transpose(1, 2)`. To prevent exceptions in the assertion `assert dt.shape == (batch,...
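
The preview of the `mamba2_simple.py` issue is cut off, so the fix it proposes is not reproduced here. For readers following the shape question, here is a rough sketch of the length arithmetic only. It assumes `self.conv1d` behaves like a `torch.nn.Conv1d` with `padding = d_conv - 1`, which the snippet does not state. The helper `conv1d_out_len` and the values `seqlen = 200`, `d_conv = 4` are made up for illustration.

```python
def conv1d_out_len(seq_len: int, kernel_size: int, padding: int,
                   stride: int = 1, dilation: int = 1) -> int:
    """Output length of a 1D convolution (the standard Conv1d length formula)."""
    return (seq_len + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# Illustrative values only; they are not taken from the issue.
seqlen, d_conv = 200, 4

# ASSUMPTION: the convolution pads both ends with d_conv - 1 zeros.
padded_len = conv1d_out_len(seqlen, d_conv, padding=d_conv - 1)

# Number of trailing positions that would have to be dropped for the
# sequence dimension to match seqlen again.
extra = padded_len - seqlen

print(padded_len, extra)
```

Under that assumption the convolution output is `d_conv - 1` steps longer than the input, so any later check that compares a tensor's sequence dimension against `seqlen` would only pass after the extra positions are trimmed.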