mamba
Hi, I wonder if mamba can replace the MLP layers for tabular data analysis. Can I directly use reshape to increase the dimensions of input data? Thanks.
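For context, a minimal sketch (numpy, with hypothetical shapes) of the reshape the question is asking about: reinterpreting a flat tabular feature vector as a short sequence of `d_model`-sized "tokens" so it can be fed to a sequence model. Whether this is a meaningful tokenization of tabular columns is exactly the open question here.

```python
import numpy as np

# Hypothetical tabular batch: 8 rows, 64 features each.
batch, n_features = 8, 64
x = np.random.randn(batch, n_features)

# Reinterpret each row as a sequence of 4 "tokens" of dimension 16.
# This only works when n_features == seq_len * d_model.
seq_len, d_model = 4, 16
x_seq = x.reshape(batch, seq_len, d_model)

print(x_seq.shape)  # (8, 4, 16)
```

Note that `reshape` imposes an arbitrary grouping of features into "tokens"; it changes the tensor's shape, not its information content.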
Adding `inputs_embeds` to `model.forward` (same name as HF models use) so the model can be given token embeddings directly rather than token ids. The main use case is training soft prompts...
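A sketch of the soft-prompt use case the request mentions (numpy, hypothetical shapes and names; the actual Mamba `forward` signature is precisely what this issue asks to extend): learned prompt vectors are prepended to ordinary token embeddings, and the concatenated tensor is what an `inputs_embeds` argument would receive.

```python
import numpy as np

d_model, vocab = 16, 100
rng = np.random.default_rng(0)

# Frozen embedding table and a batch of token ids (hypothetical values).
embedding = rng.normal(size=(vocab, d_model))
token_ids = np.array([[5, 7, 42]])       # (batch=1, seq=3)
token_embeds = embedding[token_ids]      # (1, 3, 16)

# Trainable soft prompt: 4 learned vectors not tied to any vocabulary token.
n_prompt = 4
soft_prompt = rng.normal(size=(1, n_prompt, d_model))

# This concatenation is what would be passed as inputs_embeds.
inputs_embeds = np.concatenate([soft_prompt, token_embeds], axis=1)
print(inputs_embeds.shape)  # (1, 7, 16)
```

Since soft-prompt vectors need not correspond to any row of the embedding table, there is no token id that can represent them, which is why the embedding-level entry point is needed.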
Just a point of curiosity really. I noticed in your code that either ReLU or Swish is used. I understand the choice of including ReLU as it is commonly accepted...
Right now there are no tests / docs on how to continue generation from a given state. I think I figured [it out](https://gist.github.com/Maykeye/3b4fb40ec24943a1255d8665041c8380): at least generating by parts and by...
Hello, Thanks for your interesting work but I have a question about the code that I'd like to discuss with you. Despite fixing all the random seeds, I'm still observing...
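For reference, seeding alone is often not enough: on GPU, some kernels (e.g. those using atomic adds in backward passes) are nondeterministic regardless of seeds. Below is a minimal sketch of the usual seeding checklist; the torch/CUDA lines shown in comments are the standard ones, and whether Mamba's custom kernels are deterministic is the open question in this issue.

```python
import random
import numpy as np

def seed_everything(seed: int) -> None:
    """Seed the Python and NumPy RNGs; torch equivalents noted in comments."""
    random.seed(seed)
    np.random.seed(seed)
    # In a real PyTorch run you would also need:
    #   torch.manual_seed(seed); torch.cuda.manual_seed_all(seed)
    #   torch.backends.cudnn.deterministic = True
    #   torch.backends.cudnn.benchmark = False
    #   torch.use_deterministic_algorithms(True)  # raises on nondeterministic ops

# Same seed -> identical draws. If results still diverge after all of the
# above, the cause is a nondeterministic kernel, not the seeding itself.
seed_everything(0)
a = np.random.randn(3)
seed_everything(0)
b = np.random.randn(3)
print(np.array_equal(a, b))  # True
```

`torch.use_deterministic_algorithms(True)` is a useful diagnostic: it raises an error at the first op that has no deterministic implementation, pointing directly at the culprit.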
I just sort of guessed my way to this use case and want confirmation if I'm using it correctly or how it's actually intended to be used. https://colab.research.google.com/drive/1IZvH_4h7JI-vjN0ruS1cg3CnUB7OIRzE?usp=sharing Also is...
Hi, I am running this repo, but I get the following error. Does anyone know how to fix this? And what is the meaning of 'min_p'? Thanks!: (base) seelur@seelur-B560M-HDV-A-R2-0:~/git/mamba$ python...
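On the `min_p` part of the question: in samplers that support it, min-p filtering keeps only tokens whose probability is at least `min_p` times the probability of the most likely token, then renormalizes. The sketch below illustrates that idea in isolation; it is not necessarily Mamba's exact implementation.

```python
import numpy as np

def min_p_filter(probs: np.ndarray, min_p: float) -> np.ndarray:
    """Zero out tokens below min_p * max(probs), then renormalize."""
    threshold = min_p * probs.max()
    kept = np.where(probs >= threshold, probs, 0.0)
    return kept / kept.sum()

probs = np.array([0.5, 0.3, 0.15, 0.05])
print(min_p_filter(probs, min_p=0.5))  # keeps only 0.5 and 0.3, renormalized
```

Unlike top-k or top-p, the number of surviving tokens adapts to how peaked the distribution is: a confident distribution prunes aggressively, a flat one keeps many candidates.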
https://github.com/state-spaces/mamba/blob/009bec5ee37f586844a3fc89c040a9c1a9d8badf/mamba_ssm/ops/selective_scan_interface.py#L321 should be:

```python
x = causal_conv1d_fn(
    x,
    rearrange(conv1d_weight, "d 1 w -> d w"),
    conv1d_bias,
    activation="silu",
)
```
To fix:

```
trainer.train()
  File "/home/julio/anaconda3/envs/mamba/lib/python3.10/site-packages/transformers/trainer.py", line 1555, in train
    return inner_training_loop(
  File "/home/julio/anaconda3/envs/mamba/lib/python3.10/site-packages/transformers/trainer.py", line 1789, in _inner_training_loop
    self.control = self.callback_handler.on_train_begin(args, self.state, self.control)
  File "/home/julio/anaconda3/envs/mamba/lib/python3.10/site-packages/transformers/trainer_callback.py", line 363, in on_train_begin...
```