TextAttack-A2T
Training time issue
Hi, I am training a RoBERTa model with A2T, but it seems to take a really long time. Is it normal for it to take this long?
Hi @Han8931. Can you provide more detail as to how you are training your model (e.g. dataset size, attack parameters, etc)?
I just ran a model with the provided configuration (BERT for IMDb). Even a single epoch seems to take around 8 hours; in total it took more than two days.
Can confirm this problem. I tried SNLI with A2T on a 2080 Ti (batch size 12): the first clean epoch took 7 hours, and the generation with A2T was estimated to take 21 hours. I tried again on a 3090 (batch size 32), and the first clean epoch still took 3 hours. By comparison, I wrote a simple BERT fine-tuning script and a clean epoch took only around 1 hour on the same 2080 Ti.
===
Found the problem. In textattack.Trainer.training_step(), the input texts are forced to pad to the max length of the pretrained model, making the model computation much slower than it needs to be. Changing the padding parameter to True significantly improves the speed, and I haven't noticed any problems so far. Is there any specific reason for padding to max length? I'll bring this issue to textattack for further discussion.
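For anyone hitting the same slowdown, here is a minimal sketch (not the actual textattack code) of the difference in question, assuming tokenization goes through a standard Hugging Face tokenizer: padding="max_length" pads every example to the model's maximum sequence length, while padding=True only pads to the longest example in the batch.

```python
# Hypothetical illustration of fixed vs. dynamic padding with a Hugging Face tokenizer.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = ["a short review", "another fairly short review about a movie"]

# Pads every example to the model's max length (512 for BERT), so the forward
# pass always runs over 512 positions per example, even for short inputs.
fixed = tokenizer(batch, padding="max_length", truncation=True, return_tensors="pt")
print(fixed["input_ids"].shape)    # torch.Size([2, 512])

# Pads only to the longest example in the current batch, so short inputs stay
# short and the forward pass is much cheaper.
dynamic = tokenizer(batch, padding=True, truncation=True, return_tensors="pt")
print(dynamic["input_ids"].shape)  # e.g. torch.Size([2, 11])
```

With datasets like IMDb or SNLI, where most examples are far shorter than the model's max length, dynamic padding can cut per-batch compute substantially, which would explain the epoch-time difference reported above.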