
Results 311 mamba issues

Hi, thanks for your work; the SSM architecture is very interesting! I am using some tiny variants of Mamba blocks in my work and would appreciate the possibility to have...

I tried to run the command `python benchmarks/benchmark_generation_mamba_simple.py --model-name "state-spaces/mamba-130m" --prompt "My cat wrote all this CUDA code for a new language model and" --topp 0.9 --temperature 0.7 --repetition-penalty 1.2`....

If labels are passed to the model along with the input_ids, it will return the loss. Otherwise it functions the same as it did originally. This allows it to be...
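The optional-labels behaviour described above can be sketched in a few lines. This is only an illustration, not the code from the pull request: `toy_logits` and `forward` are hypothetical names, the "model" is a fixed stand-in, and the next-token shift (position t predicts `labels[t + 1]`) is an assumption about how the loss might be computed.

```python
import math


def toy_logits(input_ids, vocab_size=4):
    # Hypothetical stand-in for a language model: one logit vector per
    # position, favouring the token id that follows the current one.
    return [
        [1.0 if v == (tok + 1) % vocab_size else 0.0 for v in range(vocab_size)]
        for tok in input_ids
    ]


def forward(input_ids, labels=None):
    """Return logits alone, or (logits, loss) when labels are supplied."""
    logits = toy_logits(input_ids)
    if labels is None:
        # No labels: behave exactly as before and return only the logits.
        return logits
    if len(labels) < 2:
        raise ValueError("need at least two tokens to form a next-token loss")
    # Assumed objective: cross-entropy of position t against labels[t + 1].
    losses = []
    for row, target in zip(logits[:-1], labels[1:]):
        log_norm = math.log(sum(math.exp(x) for x in row))
        losses.append(log_norm - row[target])
    return logits, sum(losses) / len(losses)
```

Calling `forward(ids)` keeps the original return value, while `forward(ids, labels=ids)` additionally yields a scalar loss, so existing callers would not need to change.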

There's no training code included in the repo, so it's hard to tell exactly how the training was done. The paper states: "We use the Pile dataset (L. Gao, Biderman,...

```python ________________________________________________________________________________________________ test_mamba_inner_fn[False-True-128-itype0-wtype0] ________________________________________________________________________________________________ is_variable_B = False, is_variable_C = True, seqlen = 128, itype = torch.float32, wtype = torch.float32 @pytest.mark.parametrize('wtype', [torch.float32, torch.complex64]) # @pytest.mark.parametrize('wtype', [torch.complex64]) # @pytest.mark.parametrize('itype', [torch.float32, torch.float16, torch.bfloat16])...

The save_pretrained() method can, and often will, fail when training with multiple processes, because another process can create the save directory after the existence check, which will...

Your work is outstanding, and I admire the efficiency achieved in your mamba implementation. However, I’m concerned about its accessibility and broader adoption in comparison to transformer-based methods, which are...

Here is the trace. ``` Traceback (most recent call last): File "/home/lukaemon/Dev/lab/mamba/train.py", line 21, in <module> from mamba import MambaLM, MambaConfig File "/home/lukaemon/Dev/lab/mamba/mamba.py", line 11, in <module> from mamba_ssm import Mamba File...

I am trying to train a simple text generation model. My dataset is ```python texts = [] for a in range(100): for b in range(100): texts.append(f"{a} + {b} = {a+b}") texts.append(f"{a}...