Hannah Sterz

Results: 8 comments by Hannah Sterz

Hi, yes, you are right. For a BART model with 12 encoder layers and 12 decoder layers, the encoder layers would have IDs 0 to 11 and the decoder layers...
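
For illustration, a minimal sketch of restricting adapters to the encoder via `leave_out`; the assumption that the decoder layers continue at IDs 12 to 23, as well as the checkpoint and adapter names, are illustrative and not confirmed by the comment above:

```python
import adapters
from adapters import SeqBnConfig
from transformers import BartModel

model = BartModel.from_pretrained("facebook/bart-large")  # 12 encoder + 12 decoder layers
adapters.init(model)  # attach adapter support to the plain transformers model

# Assumption: encoder layers are IDs 0-11 and decoder layers continue at 12-23,
# so leaving out 12-23 places adapters in the encoder only.
config = SeqBnConfig(leave_out=list(range(12, 24)))
model.add_adapter("encoder_only", config=config)
model.set_active_adapters("encoder_only")
```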

For a setup as described in MAD-X, you would train a Bottleneck Adapter with an invertible adapter on a language modeling task like MLM on unlabeled text. To ensure that...
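
A minimal sketch of that MAD-X-style setup with the `adapters` library follows; the checkpoint name, adapter name, and masking probability are placeholders:

```python
import adapters
from adapters import SeqBnInvConfig
from transformers import AutoModelForMaskedLM, AutoTokenizer, DataCollatorForLanguageModeling

model = AutoModelForMaskedLM.from_pretrained("xlm-roberta-base")
adapters.init(model)

# Bottleneck adapter plus invertible adapter, as in the MAD-X language adapters
model.add_adapter("lang_adapter", config=SeqBnInvConfig())
model.train_adapter("lang_adapter")  # freeze the base model, train only the adapter

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
collator = DataCollatorForLanguageModeling(tokenizer, mlm=True, mlm_probability=0.15)
# ... feed unlabeled text through a standard transformers Trainer with this collator
```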

Hey, thanks for your work on this. We have been working on developing a new `adapters` version of the library, which is decoupled from the `transformers` library (see #584 for...
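
For reference, a minimal sketch of the decoupled workflow, with the checkpoint and adapter names as placeholders:

```python
import adapters
from transformers import AutoModel

# The new library no longer forks transformers; adapter support is
# attached to a plain transformers model after loading:
model = AutoModel.from_pretrained("bert-base-uncased")
adapters.init(model)
model.add_adapter("my_adapter")
model.set_active_adapters("my_adapter")
```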

Hey @Ch-rode, the inconsistency between adapters trained with the old vs. the new library sounds like something we should look into. To reproduce it, can you specify what task...

Hey @FBehrad, ViT is supported: you can use the `ViTAdapterModel`, which you can load with `from_pretrained` as you would with `transformers`. The model provides all the adapter...
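
A minimal usage sketch; the checkpoint name, adapter name, and label count are placeholders, and the image classification head method is an assumption about the ViT model class:

```python
from adapters import ViTAdapterModel

model = ViTAdapterModel.from_pretrained("google/vit-base-patch16-224-in21k")
model.add_adapter("vit_task")
# Assumption: an image classification head can be added by name, as with text heads
model.add_image_classification_head("vit_task", num_labels=10)
model.train_adapter("vit_task")  # freeze the backbone, train adapter + head
```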

Hey @arsserpentarium , I have created a [notebook](https://colab.research.google.com/drive/1RCNwLH9x8N9yhFuCwmX93SIE59cA9Vby?usp=sharing) illustrating the use of the embeddings functionality. During that, I found a bug with the training addressed in #655 so please install...
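
For orientation, a minimal sketch of the embeddings functionality; the tokenizer path and the embedding name are placeholders:

```python
from adapters import AutoAdapterModel
from transformers import AutoTokenizer

model = AutoAdapterModel.from_pretrained("bert-base-uncased")
ref_tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
new_tokenizer = AutoTokenizer.from_pretrained("path/to/target-tokenizer")  # placeholder

# Create a new embedding matrix for the target tokenizer, copying the
# vectors of tokens it shares with the reference vocabulary:
model.add_embeddings("target", new_tokenizer,
                     reference_embedding="default",
                     reference_tokenizer=ref_tokenizer)
model.set_active_embeddings("target")
```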

Thanks for your question. Unfortunately, the current implementation does not support pushing and loading fusion layers to and from the hub. I am going to change this to a feature...
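
Until then, fusion weights can be saved to and loaded from a local directory; a minimal sketch, with adapter names and paths as placeholders:

```python
import adapters
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")
adapters.init(model)
# Assumption: "a" and "b" stand in for two previously trained adapters
model.load_adapter("path/to/a", load_as="a")
model.load_adapter("path/to/b", load_as="b")
model.add_adapter_fusion(["a", "b"])

# Fusion weights can still be saved and restored locally:
model.save_adapter_fusion("./fusion_checkpoint", "a,b")
model.load_adapter_fusion("./fusion_checkpoint")
```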

Hey @ZeguanXiao, I see why this is unexpected behavior. Unfortunately, it is not as easy as changing the `iter_layer` indices. I will look into this.