Fine-tuning Albert large
I fine-tuned ALBERT base on my task but didn't get the desired accuracy. Now that I am trying to fine-tune ALBERT large, I get this error: "Resource exhausted: OOM when allocating tensor with shape[8,512,16,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc"
I used a single GPU with 12 GB of memory, and also one with 16 GB (two different attempts). Interestingly, I can fine-tune BERT base on the single GPU with 12 GB of memory.
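For what it's worth, a quick back-of-the-envelope on that tensor shape (reading [8, 512, 16, 64] as batch x seq_len x num_heads x head_dim, which is an assumption on my part) shows why shrinking the batch size or max sequence length is the usual first fix:

```python
# Rough size of one such activation tensor in float32 (4 bytes per element);
# many of these are kept alive per layer during training, plus the
# [batch, heads, seq, seq] attention scores that grow quadratically in seq_len.
def tensor_mib(batch, seq_len, num_heads=16, head_dim=64, bytes_per_elem=4):
    return batch * seq_len * num_heads * head_dim * bytes_per_elem / 2**20

print(tensor_mib(8, 512))   # 16.0 MiB at the failing batch/sequence size
print(tensor_mib(4, 128))   # 2.0 MiB with batch 4 and seq_len 128
```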
If I'm not mistaken, ALBERT large requires more memory than BERT base; its requirements are probably closer to those of BERT large.
Thank you brightmart! I experienced the same thing, but even ALBERT xlarge has a smaller model size than BERT base. It seems it is computationally more expensive, yet the final model size is smaller.
It is more expensive computationally because it has a bigger architecture (more transformer layers, a larger hidden size, etc.), and therefore more computation. It has fewer parameters because it shares parameters across layers and decouples the embedding size from the hidden size, which reduces the embedding size tremendously and consequently the parameter count and model size.
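To make the embedding-factorization point concrete, here is a tiny sketch (the vocab/hidden/embedding numbers are illustrative ALBERT-large-style values, not quoted from this thread):

```python
# Parameter count of the token-embedding table, coupled vs. factorized.
vocab_size, hidden_size, embed_size = 30000, 1024, 128

# BERT-style: embeddings map straight to the hidden size.
coupled = vocab_size * hidden_size                               # ~30.7M params

# ALBERT-style: small embedding, then a projection up to the hidden size.
factorized = vocab_size * embed_size + embed_size * hidden_size  # ~4.0M params

print(coupled, factorized)  # roughly an 8x reduction in embedding parameters
```

Cross-layer parameter sharing then removes the per-layer copies of the transformer weights, which is why the checkpoint is much smaller even though the forward/backward pass is not cheaper.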
If you use version 2, the memory consumption should be much smaller and you may be able to finetune the models.
I was using version 2, but I haven't succeeded in fine-tuning ALBERT large.
Not sure why, but it seems these models are tagged as not fine-tunable.
@dhruvsakalley that was due to a bug in the TF-Hub UI, which should be resolved now. Also, please switch to the "/3" path for TF-Hub modules (see Jan 7 update in readme for details).
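For anyone else landing here, a minimal sketch of loading a "/3" module with the TF1-style hub API looks roughly like this (the exact handle and signature names are assumptions based on the usual BERT/ALBERT hub-module conventions, so check them against the readme):

```python
import tensorflow as tf
import tensorflow_hub as hub

# trainable=True is what makes the module fine-tunable.
albert_module = hub.Module(
    "https://tfhub.dev/google/albert_large/3",  # note the "/3" path
    trainable=True)

input_ids = tf.placeholder(tf.int32, [None, 512])
input_mask = tf.placeholder(tf.int32, [None, 512])
segment_ids = tf.placeholder(tf.int32, [None, 512])

outputs = albert_module(
    dict(input_ids=input_ids, input_mask=input_mask, segment_ids=segment_ids),
    signature="tokens", as_dict=True)

pooled_output = outputs["pooled_output"]      # [batch, hidden], for classification heads
sequence_output = outputs["sequence_output"]  # [batch, seq_len, hidden]
```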
I managed to fine-tune albert_large and got a better result than with albert_base. However, the xlarge model yields unreasonably worse results. I suspect it's due to the hyperparameters or to the way I loaded the weights.