
Question on T5 NLG

Open AtheerAlgherairy opened this issue 1 year ago • 4 comments

I have a question regarding the training loss for T5 NLG. If we do not set 'metric_for_best_model' to 'bleu', as shown in the picture below, is it automatically set to 'loss'? What is the best practice for training T5 NLG?

[Screenshot: training arguments]

AtheerAlgherairy · Jan 22 '24

Yes, the default metric is loss. According to some NLG studies/practices, continuing to train the model after it reaches its lowest validation loss can still improve generation metrics such as BLEU.
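For reference, a minimal sketch of switching checkpoint selection from loss to BLEU with Hugging Face's `Seq2SeqTrainingArguments` (the argument names are standard Transformers options; the output directory is a placeholder, and the `"bleu"` key must match whatever your `compute_metrics` function returns):

```python
# Hypothetical sketch: select the best checkpoint by BLEU instead of loss.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="output/nlg",        # placeholder output directory
    evaluation_strategy="epoch",    # evaluate once per epoch
    save_strategy="epoch",          # must match evaluation_strategy
    load_best_model_at_end=True,    # reload the best checkpoint after training
    metric_for_best_model="bleu",   # defaults to "loss" when unset
    greater_is_better=True,         # BLEU is higher-is-better (loss is the opposite)
    predict_with_generate=True,     # generate text during eval so BLEU can be computed
)
```

Note that `metric_for_best_model="bleu"` only works if the trainer's `compute_metrics` returns a metric named `bleu`.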

zqwerty · Jan 24 '24

Thanks.

AtheerAlgherairy · Jan 24 '24

Hi, I used T5-base for NLG and got the following results:

[Screenshots: evaluation results]

However, the 'err' score was 0.5966753105391215. Any ideas on how to improve it?

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10.0

Framework versions

  • Transformers 4.24.0
  • Pytorch 2.0.1+cu118
  • Datasets 2.7.1
  • Tokenizers 0.13.2
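For reference, the hyperparameters listed above roughly correspond to the following `Seq2SeqTrainingArguments`; this is a hypothetical reconstruction (the output path is a placeholder), not the exact ConvLab-3 training script:

```python
# Rough reconstruction of the reported configuration (Transformers 4.24 names).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="output/t5-base-nlg",    # placeholder
    learning_rate=1e-3,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=2,      # effective train batch size: 32 * 2 = 64
    num_train_epochs=10,
    lr_scheduler_type="linear",
    seed=42,
    # optimizer defaults to AdamW with betas=(0.9, 0.999), eps=1e-8
)
```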

AtheerAlgherairy · Feb 18 '24

Sorry for the late reply. Does "err" mean slot error rate? Maybe you could try some pre-training?
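If "err" is indeed a slot error rate, one common simplified definition is the fraction of reference slot values that never show up in the generated utterance. A minimal sketch under that assumption (ConvLab-3's exact computation may differ, and a full version would also count hallucinated slots, typically via an NLU pass):

```python
# Hypothetical slot error rate: fraction of reference slot values missing
# from the generated text. May differ from ConvLab-3's actual "err" metric.
def slot_error_rate(reference_slots, generated_text):
    if not reference_slots:
        return 0.0
    text = generated_text.lower()
    missing = sum(1 for value in reference_slots.values()
                  if str(value).lower() not in text)
    return missing / len(reference_slots)

# Example: one of two slot values ("thai") is missing -> 0.5
print(slot_error_rate({"area": "north", "food": "thai"},
                      "There is a restaurant in the north."))
```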

zqwerty · Mar 11 '24