
Results 18 comments of Mohith Kune

Hey @Jellymoon, the Mamba model works as expected during the training loop. However, it fails during the evaluation loop. I found that it is necessary to set `use_cache=False` when...
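For anyone hitting the same eval-loop failure, the workaround can be sketched like this. `disable_cache_for_eval` is a hypothetical helper, not part of `transformers`; it just assumes an HF-style model that exposes a `config` with a `use_cache` flag:

```python
def disable_cache_for_eval(model):
    """Hypothetical helper: switch off caching before the evaluation loop.

    Assumes an HF-style model whose ``config`` carries a ``use_cache`` flag.
    """
    if hasattr(model, "config") and hasattr(model.config, "use_cache"):
        model.config.use_cache = False
    return model

# Equivalently, pass the flag per forward call in your own eval loop:
#   outputs = model(input_ids, use_cache=False)
```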

I noticed that training (fine-tuning) is very slow compared to other HF transformer models. Can anything be improved here?

@Adibvafa, I have `mamba-ssm` installed. However, I realized that it also needs the `causal-conv1d>=1.4.0` package to train faster. Otherwise, it shows a warning that conv1d will fall back to a slow/sequential...
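A quick way to check at runtime whether the fast kernel is available before kicking off a long training run; `has_fast_conv1d` is a hypothetical helper, not part of `mamba-ssm`:

```python
import importlib.util

def has_fast_conv1d() -> bool:
    """True if the optional fast-kernel package is importable.

    Install it with:  pip install "causal-conv1d>=1.4.0"
    """
    return importlib.util.find_spec("causal_conv1d") is not None
```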

@Adibvafa, is this a bug in Mamba or in `transformers`? Can you elaborate? Please share a link to the issue.

I successfully ran the Mamba model with the new changes you made to the code. Any chance that this will also support the Mamba2 model?

I'm looking to transcribe multiple audio files at once with WhisperX - purely batch inference. Can anyone point me in the right direction?
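In case it helps, a minimal sketch of one way to loop many files through a single loaded model. `transcribe_many` is a hypothetical helper; the commented-out WhisperX calls follow its documented `load_model`/`transcribe` API but are an assumption here, untested:

```python
def transcribe_many(paths, transcribe_fn, batch_size=16):
    """Hypothetical helper: transcribe many audio files with one model.

    ``transcribe_fn(path, batch_size=...)`` is injected, so the loop is
    decoupled from any particular backend. Returns {path: result}.
    """
    return {p: transcribe_fn(p, batch_size=batch_size) for p in paths}

# With WhisperX (assumption: whisperx installed, GPU available):
#   import whisperx
#   model = whisperx.load_model("large-v2", device="cuda")
#   results = transcribe_many(files, lambda f, batch_size: model.transcribe(
#       whisperx.load_audio(f), batch_size=batch_size))
```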