mamba
Most CV papers (e.g. Vim) set d_state=16. But I wonder whether a larger value such as d_state=64 should be tied to the number of tokens.
at least including the following files: **mamba_ssm-1.2.0.post1+cu118torch2.2cxx11abiTRUE-cp312-cp312-linux_x86_64.whl** **mamba_ssm-1.2.0.post1+cu118torch2.3cxx11abiTRUE-cp312-cp312-linux_x86_64.whl** **mamba_ssm-1.2.0.post1+cu118torch2.3cxx11abiFALSE-cp312-cp312-linux_x86_64.whl**
[image] Has anyone run into this problem? How should I deal with it? Thank you.
I am getting the following error trying to load a Mamba model: `TypeError: MambaConfig.__init__() got an unexpected keyword argument '_name_or_path'` This is due to the config.json having this as its...
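One possible workaround for this kind of `TypeError` is to drop keys the config class does not declare before constructing it. The sketch below assumes the config class is a plain dataclass; `DemoConfig`, its fields, and `filter_config_kwargs` are hypothetical stand-ins for illustration, not the library's actual API.

```python
from dataclasses import dataclass, fields


@dataclass
class DemoConfig:
    # Hypothetical stand-in for a dataclass-based model config.
    d_model: int = 2560
    n_layer: int = 64
    vocab_size: int = 50277


def filter_config_kwargs(config_cls, raw):
    """Keep only the keys that config_cls declares as dataclass fields."""
    allowed = {f.name for f in fields(config_cls)}
    return {k: v for k, v in raw.items() if k in allowed}


# Example dict as it might be read from a config.json with an extra key.
raw = {"_name_or_path": "some/checkpoint", "d_model": 768, "n_layer": 24}

config = DemoConfig(**filter_config_kwargs(DemoConfig, raw))
print(config)
```

Filtering silently discards unknown keys, so it may be worth logging which ones were dropped if they could matter.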
Hello, currently I can get mamba working on 2.2 but that breaks flash-attn, which works on 2.3 but breaks on 2.2. Both have the same error of unrecognized symbol importing...
Once I had my mamba environment set up and mamba-ssm installed successfully, I ran the example code provided by the repository and the output of x and y was as...
I studied the Mamba paper and found this code, but I still do not know how to implement it, because I cannot understand what the variables mean in the...
I ran it successfully. My environment is as follows: CUDA 11.8, Python 3.10.13, PyTorch 2.1.1, causal_conv1d 1.1.1, mamba-ssm 1.2.0.post1
```
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
pip install causal_conv1d==1.1.1
pip install...
```
[image] I want to install causal-conv1d, but I encountered the following issue. How can I ensure that the CUDA version and nvcc version are the same, and which should I...
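A quick way to compare the two versions is to read the release number from `nvcc --version` and compare it with `torch.version.cuda`. This is a minimal sketch, assuming `nvcc` is on `PATH` and PyTorch is installed; it returns `None` for whichever is missing. The helper names are made up for illustration.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(text):
    """Extract (major, minor) from the 'release X.Y' part of nvcc's output."""
    m = re.search(r"release (\d+)\.(\d+)", text)
    return (int(m.group(1)), int(m.group(2))) if m else None


def nvcc_release():
    """CUDA toolkit version reported by nvcc, or None if nvcc is not on PATH."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    out = subprocess.run([nvcc, "--version"], capture_output=True, text=True).stdout
    return parse_nvcc_release(out)


def torch_cuda_release():
    """CUDA version PyTorch was built with, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if torch.version.cuda is None:  # CPU-only build
        return None
    major, minor = torch.version.cuda.split(".")[:2]
    return (int(major), int(minor))


# Sample line in the format nvcc prints, used to show the parsing step.
sample = "Cuda compilation tools, release 11.8, V11.8.89"
print("parsed sample:", parse_nvcc_release(sample))
print("nvcc:", nvcc_release(), "| torch:", torch_cuda_release())
```

If the two tuples differ, extensions compiled from source may fail to build or import; which side to change depends on the setup.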
I tried two variants: removing the causal conv, and removing both the causal conv and SiLU. Both seem to destabilize training and produce NaN. Is this normal?