
7 adapter-bert issues

Dear authors, after reading the code, I find that the adapter parameters (`w1`, `w2`) are initialized with a small standard deviation by default. Does this guarantee the projection is...
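The small-std initialization is what makes each adapter start as a near-identity function: with tiny `w1` and `w2`, the residual branch contributes almost nothing, so the pretrained network's behavior is preserved at step 0. Below is a minimal NumPy sketch of that idea; the function and variable names are illustrative, not the repo's actual API, and a ReLU stands in for the paper's GELU.

```python
import numpy as np

def init_adapter(d_model, bottleneck, std=1e-3, seed=0):
    # Hypothetical helper mirroring the small-std init of the adapter
    # projections: w1 (down-project) and w2 (up-project).
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, std, size=(d_model, bottleneck))
    w2 = rng.normal(0.0, std, size=(bottleneck, d_model))
    return w1, w2

def adapter(x, w1, w2):
    # Residual bottleneck: with tiny w1, w2 the output stays close to x,
    # i.e. the module starts out as a near-identity function.
    h = np.maximum(x @ w1, 0.0)  # ReLU here; the paper uses a GELU
    return x + h @ w2

x = np.ones((2, 768))
w1, w2 = init_adapter(768, 64)
print(np.max(np.abs(adapter(x, w1, w2) - x)))  # small (~1e-4): near-identity
```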

Hi, I am trying adapters on BERT-base and evaluating on GLUE. On smaller datasets like MRPC, RTE, and CoLA I see good results, but on larger GLUE datasets like...

Hi, I have a model in which normalization happens first and then there is an add operation. In the paper you discussed the post-norm case; could you tell me how...
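For context, the difference between the two residual orderings can be written out directly. The post-norm form below matches the setting discussed in the paper; the pre-norm form is one plausible adaptation (an assumption on my part, not something the paper prescribes), keeping the adapter on the sublayer output inside the residual branch:

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def post_norm_block(x, sublayer, adapter):
    # The paper's setting: sublayer -> adapter -> residual add -> LayerNorm.
    return layer_norm(x + adapter(sublayer(x)))

def pre_norm_block(x, sublayer, adapter):
    # Hypothetical pre-norm placement: normalize first, keep the adapter
    # on the sublayer output, and leave the skip connection untouched.
    return x + adapter(sublayer(layer_norm(x)))
```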

Hi, could you confirm, in the implementation of adapters, whether the layer_norm of the original model should be unfrozen, or only the layer_norm inside the adapter? How about the...
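One common reading of the paper is that, per task, the adapters, the layer-norm parameters, and the task head are trained while the rest of BERT stays frozen. A sketch of that selection is below; the parameter names and the name filter are assumptions, and whether the original model's layer norms belong in the trained set is exactly the open question above:

```python
def trainable_params(named_params):
    # Assumed convention: keep adapter weights, layer-norm parameters,
    # and the classifier head trainable; freeze everything else.
    # Names like "encoder/layer_3/adapter/w1" are hypothetical.
    keep = ("adapter", "layer_norm", "classifier")
    return {name: p for name, p in named_params.items()
            if any(k in name for k in keep)}
```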

Hi, thanks for your great work! I am unable to reproduce the reported results on the GLUE datasets. Could you provide the hyperparameters (training epochs, learning rate, ...) for the 9...

Congratulations on the great paper! One question: do you have additional processor classes? At the moment the code reads `processors = { "cola": ColaProcessor, "mnli": MnliProcessor, "mrpc": MrpcProcessor, }` and...
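Since the repo is built on BERT's `run_classifier.py`, a new task can be supported by writing a processor in the same pattern and registering it in that dict. Here is a hypothetical SST-2 processor as an illustration; it assumes the usual `DataProcessor`, `InputExample`, and `_read_tsv` names from `run_classifier.py` and the standard GLUE `train.tsv`/`dev.tsv` layout:

```python
import os

class SstProcessor(DataProcessor):
    """Hypothetical processor for SST-2, following the built-in pattern."""

    def get_train_examples(self, data_dir):
        return self._create_examples(
            self._read_tsv(os.path.join(data_dir, "train.tsv")), "train")

    def get_dev_examples(self, data_dir):
        return self._create_examples(
            self._read_tsv(os.path.join(data_dir, "dev.tsv")), "dev")

    def get_labels(self):
        return ["0", "1"]

    def _create_examples(self, lines, set_type):
        examples = []
        for i, line in enumerate(lines):
            if i == 0:  # skip the TSV header row
                continue
            guid = "%s-%s" % (set_type, i)
            examples.append(InputExample(
                guid=guid, text_a=line[0], text_b=None, label=line[1]))
        return examples

processors["sst-2"] = SstProcessor  # register alongside the built-ins
```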

Thanks for the great work here. I have a question: when I read through the paper, I understand that training fewer parameters should bring a speed benefit; please correct...
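Worth separating two things here: adapters shrink the number of *trained* parameters (less optimizer state, less gradient memory, tiny per-task checkpoints), but the full network still runs in every forward pass, so per-step compute is roughly unchanged and even slightly higher. A back-of-the-envelope count, under assumed BERT-base shapes and two adapters per layer:

```python
# Assumed shapes: BERT-base hidden size, 12 layers, bottleneck of 64.
d_model, layers, bottleneck = 768, 12, 64

bert_base_params = 110e6  # ~110M, all updated in full fine-tuning

# Per adapter: down/up projection weights plus their biases;
# two adapters per transformer layer (after attention and after the FFN).
per_adapter = 2 * d_model * bottleneck + d_model + bottleneck
adapter_params = layers * 2 * per_adapter

print(f"adapter params: {adapter_params / 1e6:.2f}M "
      f"({adapter_params / bert_base_params:.1%} of BERT-base)")
# ~2.38M trained parameters, i.e. roughly 2% of the full model
```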