Adibvafa Fallahpour
@ArthurZucker Now that https://github.com/huggingface/transformers/pull/32080 is merged, can we do a final review for this one too? I would also like to add Mamba2ForSequenceClassification to this PR, so...
> Hey @Jellymoon, the Mamba model works as expected during the training loop. However, it fails during the evaluation loop, so I found that it is necessary to set `use_cache=False`...
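For anyone hitting the same evaluation failure, here is a minimal sketch of turning the cache off via the config before the evaluation loop runs. The checkpoint name and `num_labels` are illustrative placeholders, and this assumes the sequence-classification head from this PR is available through the auto class:

```python
# Minimal sketch: disable the cache so Mamba's evaluation loop does not fail.
# "state-spaces/mamba-130m-hf" and num_labels=2 are placeholders, not from this thread.
from transformers import AutoConfig, AutoModelForSequenceClassification

checkpoint = "state-spaces/mamba-130m-hf"
config = AutoConfig.from_pretrained(checkpoint, num_labels=2)
config.use_cache = False  # the workaround discussed above

model = AutoModelForSequenceClassification.from_pretrained(checkpoint, config=config)
```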
> I noticed that the training speed (fine-tuning) is very slow compared to other HF transformer models. Can something be improved here?

Do you have `mamba-ssm` installed? Is it...
> @Adibvafa, I have `mamba-ssm` installed. However, I realized that it also needs the `causal-conv1d>=1.4.0` package to train faster. Otherwise it was showing a warning related to conv1d that it's going to use...
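A quick way to check whether both optional speed-path packages are importable; this is a generic sketch rather than anything from the thread, and `mamba_ssm` / `causal_conv1d` are simply the import names of the packages mentioned above:

```python
# Sketch: verify the optional fast-path dependencies are present.
# Install (per the thread): pip install mamba-ssm "causal-conv1d>=1.4.0"
import importlib.util

for module in ("mamba_ssm", "causal_conv1d"):
    present = importlib.util.find_spec(module) is not None
    print(f"{module}: {'installed' if present else 'missing (slow fallback path)'}")
```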
What is the workaround for now? @DmitryDiTy
@lucasgreenwell A PR adding RAG support has been opened: https://github.com/bowang-lab/MedRAX/pull/19. Please take a look!