
learner.fit doesn't show result after every epoch

Open miluki01 opened this issue 5 years ago • 7 comments

Hi @kaushaltrivedi , I used:

learner.fit(epochs=6,
			lr=6e-5,
			validate=True, 	# Evaluate the model after each epoch
			schedule_type="warmup_cosine")

However, that code only evaluates once after the whole training run, not after each epoch. What can I do? Thanks

miluki01 avatar Aug 06 '19 06:08 miluki01

I had the same issue of no metrics being printed at all. It seems to be because the logger's default level only prints warning messages, not info messages. If you are using the root logger (e.g. logger = logging.getLogger()), run this line before you create the learner object:

logging.basicConfig(level=logging.NOTSET)

The same line also works if you are defining a custom logger yourself (e.g. logger = logging.getLogger("my-logger")), because its messages propagate up to the root handler.

The training process will then print out the loss as well as any other metrics you passed to the learner object.

Alternatively, you can always inspect the training run, either while it is in progress or afterwards, using TensorBoard. Training creates a folder called tensorboard containing all the event files.

amin-nejad avatar Aug 14 '19 14:08 amin-nejad

I have the same issue and this solution didn't solve it for me. Has anyone found anything else that works for you?

vondersam avatar Sep 05 '19 16:09 vondersam

do you get any validation metrics?

kaushaltrivedi avatar Sep 06 '19 22:09 kaushaltrivedi

No, I didn't get any validation metric results, not even after each epoch. This is the code I'm using.

databunch = BertDataBunch(data_dir=BERT_DATA_PATH/fold,
                          label_dir=LABEL_PATH,
                          tokenizer=args['bert_model'],
                          train_file=f'train{is_masked}.csv',
                          val_file=f'val{is_masked}.csv',
                          test_data=None,
                          text_col="text",
                          label_col=labels_index,
                          batch_size_per_gpu=args['train_batch_size'],
                          max_seq_length=args['max_seq_length'],
                          multi_gpu=multi_gpu,
                          multi_label=True,
                          model_type='bert')
learner = BertLearner.from_pretrained_model(databunch,
                                            pretrained_path=args['bert_model'],
                                            metrics=metrics,
                                            device=device,
                                            logger=logger,
                                            finetuned_wgts_path=None,
                                            warmup_steps=500,
                                            output_dir=fold_dir,
                                            is_fp16=args['fp16'],
                                            loss_scale=args['loss_scale'],
                                            multi_gpu=multi_gpu,
                                            multi_label=True,
                                            logging_steps=50)
learner.fit(args['num_train_epochs'], lr=args['learning_rate'], schedule_type="warmup_linear")
learner.save_model()

And this is what the logs look like (screenshot attached to the original issue).

vondersam avatar Sep 07 '19 09:09 vondersam

I am having the same problem

aaranda7 avatar Oct 19 '19 05:10 aaranda7

How can we print custom metrics that we may want to display along with accuracy?

DecentMakeover avatar Jan 22 '20 07:01 DecentMakeover
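A minimal sketch of a custom metric, assuming fast-bert's metrics parameter takes a list of {"name", "function"} dicts and calls each function with predictions and labels. The exact_match function and the plain-list inputs below are illustrative only; in practice the function receives torch tensors of logits, so you would apply a sigmoid/argmax and threshold first.

```python
# Hypothetical custom metric for fast-bert's `metrics` list.
# Plain Python lists are used here purely for illustration.

def exact_match(y_pred, y_true):
    # Fraction of examples whose (already thresholded) prediction
    # matches the label exactly.
    hits = sum(1 for p, t in zip(y_pred, y_true) if p == t)
    return hits / len(y_true)

# Passed to the learner alongside any built-in metrics:
metrics = [{"name": "exact_match", "function": exact_match}]

score = exact_match([1, 0, 1], [1, 0, 0])
```

Anything the functions return is printed with the validation results, subject to the logging-level fix discussed earlier in this thread.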

I have the same issue. I want to plot accuracy and validation accuracy. Has anyone done it?

MahdadiChaima avatar Jun 16 '22 03:06 MahdadiChaima
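Since the metrics only surface through the logger (or the TensorBoard event files), one low-tech option is to capture the per-epoch values from the text logs and plot the resulting lists. The log-line format below is an assumption for illustration, not fast-bert's exact output:

```python
import re

# Example captured log text; the exact format your logger emits may differ.
LOG = """\
epoch 1 eval_loss: 0.412 accuracy: 0.81
epoch 2 eval_loss: 0.350 accuracy: 0.86
epoch 3 eval_loss: 0.322 accuracy: 0.88
"""

def extract_metric(log_text, name):
    # Collect every "name: <float>" occurrence, in order of appearance.
    return [float(v) for v in re.findall(rf"{name}:\s*([0-9.]+)", log_text)]

acc = extract_metric(LOG, "accuracy")
# Plot with matplotlib, e.g.:
#   plt.plot(range(1, len(acc) + 1), acc, label="val accuracy")
```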