githubnemo
Can you perhaps supply a small reproducer? To clarify, the `qkv` module wouldn't receive gradients since it is not trained; only the adapter is trained. Can you confirm that the adapter...
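For example (a minimal sketch along these lines, with a placeholder model and `target_modules`, not your exact setup), you could check after a single backward pass which parameters actually end up with gradients:

```python
# Sketch: verify that gradients reach the LoRA adapter parameters while the
# frozen base projection weights (e.g. `qkv`) receive none.
# The model name and target_modules are placeholders, adjust them to your case.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
model = get_peft_model(base, LoraConfig(target_modules=["q_proj", "v_proj"]))

input_ids = torch.tensor([[1, 2, 3, 4]])
model(input_ids=input_ids, labels=input_ids).loss.backward()

for name, param in model.named_parameters():
    if "lora_" in name:
        print(name, "grad is None:", param.grad is None)  # expected: False
    elif param.grad is not None:
        print("unexpected grad on frozen parameter:", name)
```

If the `lora_` parameters show gradients, training is wired up correctly; the frozen base weights are expected to stay at `grad is None`.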
This is indeed a problem with transformers, thanks @BenjaminBossan for narrowing it down. The problem is with `model.enable_input_require_grads()` - it doesn't seem to support visual language models yet and since...
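As far as I can tell, `enable_input_require_grads()` only registers a forward hook on the text input embeddings so that their output requires grad (which gradient checkpointing needs when the base model is frozen); the vision branch of a VLM gets no such hook. A possible interim workaround, sketched below under the assumption that the vision tower's patch embedding can be located by its (model-specific) submodule path, is to register the same kind of hook there:

```python
from torch import nn
from transformers import PreTrainedModel

def require_grads_on_vision_inputs(model: PreTrainedModel, vision_embed_path: str) -> None:
    """Mirror what enable_input_require_grads() does for the text embeddings,
    but on the vision tower, so gradients can flow through the frozen vision
    branch when gradient checkpointing is enabled."""
    def make_inputs_require_grad(module: nn.Module, inputs, output):
        output.requires_grad_(True)

    # existing transformers API: hooks the LLM's input embeddings
    model.enable_input_require_grads()

    # `vision_embed_path` is model-specific (e.g. something like
    # "visual.patch_embed"); treat it as an assumption to verify per model.
    vision_embed = model.get_submodule(vision_embed_path)
    vision_embed.register_forward_hook(make_inputs_require_grad)
```

This is only a stopgap until `enable_input_require_grads()` handles vision-language models in transformers itself.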
> > I'm wondering whether transformers just take the input of LLM as input data and ignore the input data in ViT. If so, it's a potential bug for fine-tuning...
This was addressed in https://github.com/huggingface/peft/pull/2568 but wasn't completed. To [quote](https://github.com/huggingface/peft/pull/2568#issuecomment-3406489263) @not-lain:

> imo the only good thing to retain from this pr [#2568] is the config file, other scripts are...
*not stale*
> > This is actually what got me wondering. I expected the same but in both cases the "average forgetting" numbers were worse for OFT and better for Full FT....