adapter-bert
Adapters on large-datasets in GLUE could not get the same results
Hi, I am trying adapters on BERT-base and evaluating on GLUE. On the smaller datasets (MRPC, RTE, CoLA) I see good results, but on the larger GLUE datasets (MNLI, QNLI, SST-2) I am struggling: my results fall well below the BERT-base baseline.
I have a deadline soon and need a fair comparison with your method, so I would very much appreciate your feedback. Do you have any suggestions that could help on the large-scale datasets?
Thanks!
What hyperparameters are you using? Did you follow the sweep in the paper?
"We sweep learning rates in {3 · 10−5, 3 · 10−4, 3 · 10−3}, and number of epochs in {3, 20}"