multifit
OOM during finetuning
Hi,
Thank you for sharing your repo.
I am trying to fine-tune an LM with multifit on a custom dataset and then fine-tune the classifier for prediction. Unfortunately, I get an OOM error after a few steps while training the classifier.
I tried training the LM first, closing the session to free the GPU memory, and then training the classifier (loading the encoder weights, if my code is correct), but it does not help. I cannot use the same batch size. Is this normal, or am I doing something wrong?
PS: bs = 256
`---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
9 frames
/usr/local/lib/python3.6/dist-packages/fastai/text/learner.py in arrs
along the batch dimension."
--> 255 return [torch.cat([l[si] for l in arrs], dim=1) for si in range_of(arrs[0])]
256
257 def reset(self):
RuntimeError: CUDA out of memory. Tried to allocate 1.02 GiB (GPU 0; 15.90 GiB total capacity; 12.72 GiB already allocated; 599.88 MiB free; 14.61 GiB reserved in total by PyTorch)`
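To make the numbers in that message easier to compare, here is a minimal sketch that parses it and converts everything to GiB. The helper name `parse_oom_message` is hypothetical (it is not part of PyTorch or fastai), and the regular expression assumes the message keeps exactly the wording shown above.

```python
import re

# Message copied from the traceback above.
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 1.02 GiB (GPU 0; 15.90 GiB total capacity; "
    "12.72 GiB already allocated; 599.88 MiB free; 14.61 GiB reserved in total by PyTorch)"
)

UNIT_TO_GIB = {"MiB": 1 / 1024, "GiB": 1.0}


def parse_oom_message(message):
    """Hypothetical helper: extract the memory figures (in GiB) from the message."""
    pattern = r"([\d.]+) (MiB|GiB) (total capacity|already allocated|free|reserved)"
    stats = {}
    for value, unit, label in re.findall(pattern, message):
        stats[label] = float(value) * UNIT_TO_GIB[unit]
    requested = re.search(r"Tried to allocate ([\d.]+) (MiB|GiB)", message)
    stats["requested"] = float(requested.group(1)) * UNIT_TO_GIB[requested.group(2)]
    return stats


stats = parse_oom_message(OOM_MESSAGE)

# Memory held by PyTorch's caching allocator but not used by live tensors.
cached_unused = stats["reserved"] - stats["already allocated"]

print(f"requested:     {stats['requested']:.2f} GiB")
print(f"free on GPU:   {stats['free']:.2f} GiB")
print(f"cached unused: {cached_unused:.2f} GiB")
```

The request (1.02 GiB) is larger than what is free on the device (about 0.59 GiB). The cached but unused memory is larger than the request, which may point to fragmentation, but that is an interpretation rather than something the message states.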
My code:
# pretrained LM
if pretrained_lm:
    data_lm_fwd = (TextList.from_df(lm_tr.iloc[:10000], path, cols='comment_text', **fa_config)
                   .split_by_rand_pct(0.05, seed=42)
                   .label_for_lm()
                   .databunch(bs=bs, num_workers=4))
    data_lm_fwd.save("fr_data_lm_forward")

if pretrained_lm:
    learn_fwd = exp.finetune_lm.get_learner(data_lm_fwd)
    learn_fwd.model.cuda()
    learn_fwd.lr_find()
    learn_fwd.recorder.plot()

# learn is a preconfigured fastai learner with a pretrained model loaded
if pretrained_lm:
    learn_fwd.fit_one_cycle(2)
    learn_fwd.unfreeze()
    for i in range(5):
        learn_fwd.fit_one_cycle(2)
    learn_fwd.save_encoder("encoder_lm_fr_fwd")

# cls
if pretrained_cls:
    data_cls = (TextList.from_df(tr1, path, cols="comment_text", **fa_config)
                .split_from_df(col="val")
                .label_from_df(cols="toxic")
                .databunch(bs=64, num_workers=2))

if pretrained_cls:
    learn_cls_fwd = exp.classifier.get_learner(data_cls)  #, metrics=[AUROC])
    learn_cls_fwd.load_encoder("encoder_lm_fr_fwd")
    learn_cls_fwd.freeze()
    learn_cls_fwd.fit_one_cycle(3)
    learn_cls_fwd.save("multifit_cls_pretrained_fr")
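
Restarting the session between the two stages is one way to release GPU memory. An in-process alternative might look like the sketch below. The helper name `free_gpu_memory` is hypothetical (it is not part of fastai or multifit), and the sketch assumes every reference to the LM learner and its data has been deleted first. It is not claimed to fix the OOM above, only to show the cleanup step.

```python
import gc

try:
    import torch
except ImportError:  # lets the sketch run on a machine without PyTorch
    torch = None


def free_gpu_memory():
    """Hypothetical helper: collect garbage, then release PyTorch's cached GPU blocks.

    Returns the number of bytes still allocated by live tensors on the
    current device, or 0 when PyTorch or CUDA is unavailable.
    """
    gc.collect()  # drop unreachable Python objects that still hold tensors
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()  # return unused cached blocks to the driver
        return torch.cuda.memory_allocated()
    return 0


# Usage between the LM and classifier stages (names taken from the code above):
#   del learn_fwd, data_lm_fwd
#   remaining = free_gpu_memory()
remaining = free_gpu_memory()
print(f"bytes still allocated: {remaining}")
```

If `remaining` stays large after the `del`, something still holds a reference to the LM model or its tensors.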