Adibvafa Fallahpour
@ArthurZucker Now that https://github.com/huggingface/transformers/pull/32080 is merged, can we do a final review for this one too? I would also like to add Mamba2ForSequenceClassification to this PR, so...
> Hey @Jellymoon, the Mamba model works as expected during the training loop. However, it fails during the evaluation loop, so I found that it is necessary to set `use_cache=False`...
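For anyone hitting the same evaluation failure, here is a minimal sketch of turning the cache off via the config before the evaluation loop runs. The checkpoint name and `num_labels` are illustrative placeholders, and this assumes the sequence-classification head from this PR is available through the auto class:

```python
# Minimal sketch: disable the cache so Mamba's evaluation loop does not fail.
# "state-spaces/mamba-130m-hf" and num_labels=2 are placeholders, not from this thread.
from transformers import AutoConfig, AutoModelForSequenceClassification

checkpoint = "state-spaces/mamba-130m-hf"
config = AutoConfig.from_pretrained(checkpoint, num_labels=2)
config.use_cache = False  # the workaround discussed above

model = AutoModelForSequenceClassification.from_pretrained(checkpoint, config=config)
```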
> I noticed that the training speed (fine-tuning) is very slow compared to other HF transformer models. Can something be improved here?

Do you have `mamba-ssm` installed? Is it...
> @Adibvafa, I have `mamba-ssm` installed. However, I realized that it also needs the `causal-conv1d>=1.4.0` package to train faster. Otherwise it was showing a warning related to conv1d that it's going to use...
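A quick way to check whether both optional speed-path packages are importable; this is a generic sketch rather than anything from the thread, and `mamba_ssm` / `causal_conv1d` are simply the import names of the packages mentioned above:

```python
# Sketch: verify the optional fast-path dependencies are present.
# Install (per the thread): pip install mamba-ssm "causal-conv1d>=1.4.0"
import importlib.util

for module in ("mamba_ssm", "causal_conv1d"):
    present = importlib.util.find_spec(module) is not None
    print(f"{module}: {'installed' if present else 'missing (slow fallback path)'}")
```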
What is the workaround for now? @DmitryDiTy
@lucasgreenwell A PR adding RAG support has been opened: https://github.com/bowang-lab/MedRAX/pull/19. Please take a look!