mamba
Hello, researchers. I'm very interested in your work and used mamba to build a model. I wanted to compute the MACs of the model with torchstat, but then this error...
I am interested in “How to use mamba to generate audio”. One of the amazing things is the long sequence attention. I want to know whether mamba can be used in TTS,...
Hello, thanks for the nice work. Can we use the Mamba block without this convolutional layer? I would appreciate any hint or suggestion you can give me. Thank you. [image]
My input has a batch size of 200 with 100 values per sample (torch.Size([200, 100])), so my feature is just a 1-D vector. In the example code, it says you set batch,...
Dear authors, This is amazing work! I'm working with variable sequence lengths of video data. In one batch, there could be several videos with different frame numbers, and they...
@tridao, @albertfgu 1) Could you please post the exact settings that created the 1.3B Transformer used in Figure 4? 2) Could you share what settings you used (hiddendim, etc.) to create...
I want to use nn.Conv1d instead of causal-conv1d. How should I modify the code?
Dear Authors, Thanks for your brilliant work! I am now studying in detail how the parameter shapes change in your code and in your paper. I noticed that the A...
Dear author, I stacked multiple mamba layers to form a model and trained the model from scratch. When I stacked just 4 layers, the performance was very good. So I...
I am trying to write a simple mamba model in JAX, based on the first mamba paper, just for testing on a small dataset. Some tests I came across online...