TextAttack-A2T

Training time issue

Open · Han8931 opened this issue 3 years ago · 3 comments

Hi, I am training a RoBERTa model with A2T, but it seems to take a really long time. Is it normal for it to take this long?

Han8931 avatar Jan 27 '22 01:01 Han8931

Hi @Han8931. Can you provide more detail on how you are training your model (e.g. dataset size, attack parameters)?

jinyongyoo avatar Feb 14 '22 16:02 jinyongyoo

I just ran the model with the provided configurations (BERT for IMDb). Even a single run seems to take around 8 hours; in total it took more than two days.
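For reference, here is a rough sketch of what such a run looks like through TextAttack's Python API. This assumes the pip-installed textattack package and its A2TYoo2021 recipe rather than this repo's own training script, and the hyperparameter values shown are illustrative, not the paper's exact configuration:

```python
import textattack
import transformers

# Standard Hugging Face model and tokenizer, wrapped for TextAttack.
model = transformers.AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
tokenizer = transformers.AutoTokenizer.from_pretrained("bert-base-uncased")
model_wrapper = textattack.models.wrappers.HuggingFaceModelWrapper(model, tokenizer)

# IMDb train/test splits (25k examples each).
train_dataset = textattack.datasets.HuggingFaceDataset("imdb", split="train")
eval_dataset = textattack.datasets.HuggingFaceDataset("imdb", split="test")

# The A2T attack recipe is built against the model being trained.
attack = textattack.attack_recipes.A2TYoo2021.build(model_wrapper)

training_args = textattack.TrainingArgs(
    num_epochs=4,
    num_clean_epochs=1,        # epochs of ordinary training before attacks start
    attack_epoch_interval=1,   # regenerate adversarial examples every epoch
    per_device_train_batch_size=8,
)

trainer = textattack.Trainer(
    model_wrapper,
    "classification",
    attack,
    train_dataset,
    eval_dataset,
    training_args,
)
trainer.train()
```

Note that adversarial example generation runs on top of the ordinary per-epoch training cost, so some overhead relative to plain fine-tuning is expected; the question is whether the clean epochs themselves should be this slow.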

Han8931 avatar Feb 24 '22 06:02 Han8931

I can confirm this problem. I tried SNLI with A2T on a 2080 Ti (batch size 12): the first clean epoch took 7 hours, and the adversarial example generation with A2T was estimated to take 21 hours. I tried again on a 3090 (batch size 32), and the first clean epoch still took 3 hours. For comparison, a simple BERT fine-tuning script I wrote took only around 1 hour per clean epoch on the same 2080 Ti.

===

Found the problem. In textattack.Trainer.training_step(), the input texts are forced to pad to the pretrained model's max length, making the model's computation much slower than it needs to be. Changing the padding parameter to True significantly improves the speed, and I haven't noticed any problems so far.

Is there any specific reason for padding to max length? I'll bring this issue to textattack for further discussion.
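For anyone hitting the same slowdown, here is a minimal standalone sketch of the tokenizer behavior in question (not the Trainer code itself), showing the difference between the two padding modes:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = ["a short review", "another slightly longer review text"]

# padding="max_length" pads every example to the model's maximum input
# size (512 tokens for BERT), so every batch is processed at full length.
slow = tokenizer(batch, padding="max_length", truncation=True, return_tensors="pt")
print(slow["input_ids"].shape)  # torch.Size([2, 512])

# padding=True pads only to the longest example in the current batch
# (dynamic padding), which is the change described above.
fast = tokenizer(batch, padding=True, truncation=True, return_tensors="pt")
print(fast["input_ids"].shape)  # e.g. torch.Size([2, 7])
```

Since self-attention cost grows roughly quadratically with sequence length, padding short batches out to 512 tokens can plausibly account for the multi-hour gap reported above.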

zodiacg avatar May 25 '23 01:05 zodiacg