
albert large_v2

Open gogokre opened this issue 4 years ago • 2 comments

If I use the albert base or large_v1 model, training works fine. However, if I use the large_v2 model in the same way, it does not learn. It doesn't report insufficient memory; the accuracy simply does not rise. Why is that?

from transformers import AlbertForSequenceClassification, AdamW

model = AlbertForSequenceClassification.from_pretrained(
    "albert-large-v2",             # albert-large-v1
    num_labels = 2,                # The number of output labels--2 for binary classification.
    output_attentions = False,     # Whether the model returns attentions weights.
    output_hidden_states = False,  # Whether the model returns all hidden-states.
)
model.cuda()
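For reference, here is a minimal fine-tuning sketch built around the snippet above. The optimizer settings, epoch count, and train_dataloader are illustrative assumptions, not part of the original report; the larger v2 checkpoints are often reported to be sensitive to the learning rate, so this sketch uses a smaller value than is typical for the base model.

import torch
from transformers import AdamW, get_linear_schedule_with_warmup

# Hypothetical setup: train_dataloader is assumed to yield
# (input_ids, attention_mask, labels) batches already on the GPU.
epochs = 3
optimizer = AdamW(model.parameters(),
                  lr=1e-5,   # smaller than the usual 2e-5/3e-5; large-v2 is often LR-sensitive
                  eps=1e-8)
total_steps = len(train_dataloader) * epochs
scheduler = get_linear_schedule_with_warmup(optimizer,
                                            num_warmup_steps=int(0.1 * total_steps),
                                            num_training_steps=total_steps)

model.train()
for epoch in range(epochs):
    for input_ids, attention_mask, labels in train_dataloader:
        optimizer.zero_grad()
        outputs = model(input_ids=input_ids,
                        attention_mask=attention_mask,
                        labels=labels)
        loss = outputs[0]  # loss is the first element of the returned tuple when labels are given
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)  # guard against exploding gradients
        optimizer.step()
        scheduler.step()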

gogokre commented Apr 06 '20 21:04

I've noticed similar behavior between albert_base/1 and albert_base/3. Training NER with albert_base/1 works well enough, but albert_base/3 is terrible: V1 starts at 86% and climbs to 89%, while V3 starts at 56% and stays there.

maziyarpanahi commented Apr 07 '20 07:04

I am facing the same issue. I used ALBERT-base V2 for my task and got ~70% DEV accuracy, but when I train the same setup with ALBERT-large V2, the accuracy is always below 10%. Quite strange!

Akshayextreme commented May 06 '20 11:05