Hannah Sterz

Results: 8 comments by Hannah Sterz

Hi, yes, you are right. For a BART model with 12 encoder layers and 12 decoder layers, the encoder layers would have IDs 0 to 11 and the decoder layers...
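
For illustration, a minimal sketch of restricting adapters to the encoder via `leave_out`; the assumption that the decoder layers continue at IDs 12 to 23, as well as the checkpoint and adapter names, are illustrative and not confirmed by the comment above:

```python
import adapters
from adapters import SeqBnConfig
from transformers import BartModel

model = BartModel.from_pretrained("facebook/bart-large")  # 12 encoder + 12 decoder layers
adapters.init(model)  # attach adapter support to the plain transformers model

# Assumption: encoder layers are IDs 0-11 and decoder layers continue at 12-23,
# so leaving out 12-23 places adapters in the encoder only.
config = SeqBnConfig(leave_out=list(range(12, 24)))
model.add_adapter("encoder_only", config=config)
model.set_active_adapters("encoder_only")
```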

For a setup as described in MAD-X, you would train a Bottleneck Adapter with an invertible adapter on a language modeling task like MLM on unlabeled text. To ensure that...
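
A minimal sketch of that MAD-X-style setup with the `adapters` library follows; the checkpoint name, adapter name, and masking probability are placeholders:

```python
import adapters
from adapters import SeqBnInvConfig
from transformers import AutoModelForMaskedLM, AutoTokenizer, DataCollatorForLanguageModeling

model = AutoModelForMaskedLM.from_pretrained("xlm-roberta-base")
adapters.init(model)

# Bottleneck adapter plus invertible adapter, as in the MAD-X language adapters
model.add_adapter("lang_adapter", config=SeqBnInvConfig())
model.train_adapter("lang_adapter")  # freeze the base model, train only the adapter

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
collator = DataCollatorForLanguageModeling(tokenizer, mlm=True, mlm_probability=0.15)
# ... feed unlabeled text through a standard transformers Trainer with this collator
```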

Hey, thanks for your work on this. We have been working on developing a new `adapters` version of the library, which is decoupled from the `transformers` library (see #584 for...
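
For reference, a minimal sketch of the decoupled workflow, with the checkpoint and adapter names as placeholders:

```python
import adapters
from transformers import AutoModel

# The new library no longer forks transformers; adapter support is
# attached to a plain transformers model after loading:
model = AutoModel.from_pretrained("bert-base-uncased")
adapters.init(model)
model.add_adapter("my_adapter")
model.set_active_adapters("my_adapter")
```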

Hey @Ch-rode, the inconsistency between adapters trained with the old vs. the new library sounds like something we should look into. To reproduce it, can you specify what task...

Hey @FBehrad, ViT is supported: you can use the `ViTAdapterModel`, which you can load with `from_pretrained` as you would with `transformers`. The model provides all the adapter...
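
A minimal usage sketch; the checkpoint name, adapter name, and label count are placeholders, and the image classification head method is an assumption about the ViT model class:

```python
from adapters import ViTAdapterModel

model = ViTAdapterModel.from_pretrained("google/vit-base-patch16-224-in21k")
model.add_adapter("vit_task")
# Assumption: an image classification head can be added by name, as with text heads
model.add_image_classification_head("vit_task", num_labels=10)
model.train_adapter("vit_task")  # freeze the backbone, train adapter + head
```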

Hey @arsserpentarium , I have created a [notebook](https://colab.research.google.com/drive/1RCNwLH9x8N9yhFuCwmX93SIE59cA9Vby?usp=sharing) illustrating the use of the embeddings functionality. During that, I found a bug with the training addressed in #655 so please install...
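
For orientation, a minimal sketch of the embeddings functionality; the tokenizer path and the embedding name are placeholders:

```python
from adapters import AutoAdapterModel
from transformers import AutoTokenizer

model = AutoAdapterModel.from_pretrained("bert-base-uncased")
ref_tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
new_tokenizer = AutoTokenizer.from_pretrained("path/to/target-tokenizer")  # placeholder

# Create a new embedding matrix for the target tokenizer, copying the
# vectors of tokens it shares with the reference vocabulary:
model.add_embeddings("target", new_tokenizer,
                     reference_embedding="default",
                     reference_tokenizer=ref_tokenizer)
model.set_active_embeddings("target")
```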

Thanks for your question. Unfortunately, the current implementation does not support pushing and loading fusion layers to and from the hub. I am going to change this to a feature...
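
Until then, fusion weights can be saved to and loaded from a local directory; a minimal sketch, with adapter names and paths as placeholders:

```python
import adapters
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")
adapters.init(model)
# Assumption: "a" and "b" stand in for two previously trained adapters
model.load_adapter("path/to/a", load_as="a")
model.load_adapter("path/to/b", load_as="b")
model.add_adapter_fusion(["a", "b"])

# Fusion weights can still be saved and restored locally:
model.save_adapter_fusion("./fusion_checkpoint", "a,b")
model.load_adapter_fusion("./fusion_checkpoint")
```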

Hey @ZeguanXiao, I see why this is unexpected behavior. Unfortunately, it is not as easy as changing the `iter_layer` indices. I will look into this.