
Early Stopping occurs during training

Open Arafat4341 opened this issue 5 years ago • 11 comments

I am trying to train the model, but during training, early stopping suddenly kicks in. The model is supposed to train for 50 epochs, but it stops at epoch 15-16.

Can anyone tell why this early stopping occurs?

Arafat4341 avatar Jul 29 '20 03:07 Arafat4341

What do you mean by "it stops"? Is the accuracy not increasing anymore? Is the script crashing? ...?

Bartzi avatar Jul 29 '20 08:07 Bartzi

@Bartzi Yeah, the validation accuracy has not increased for the last 9 epochs.

Arafat4341 avatar Jul 29 '20 08:07 Arafat4341

Well, in order to help, I would need more information. What are you training on? Your own data? What is the current validation accuracy? Did you change anything in the code? What is your training configuration?

Bartzi avatar Jul 29 '20 08:07 Bartzi

Yes, I am training on my own data. The dataset contains English and Japanese audio. The current validation accuracy is 0.84.

Training configuration:

batch_size: 64
learning_rate: 0.001
num_epochs: 50

data_loader: "ImageLoader"
color_mode: "L"  # L = bw or RGB
input_shape: [129, 500, 1]

model: "topcoder_crnn_finetune" # _finetune"

segment_length: 10  # number of seconds each spectrogram represents
pixel_per_second: 50

label_names: ["EN", "JP"]
num_classes: 2

Arafat4341 avatar Jul 29 '20 08:07 Arafat4341

It could be that your model has converged, or that your learning rate is too high at this point. You could add a callback that scales the learning rate of the Adam optimizer; maybe that helps.
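For example, a minimal sketch of such a callback using Keras' LearningRateScheduler could look like this (the schedule values are illustrative, not tuned for this dataset):

from keras.callbacks import LearningRateScheduler

def step_decay(epoch):
    # Illustrative schedule: start at 1e-3 and halve the learning rate every 10 epochs.
    return 0.001 * (0.5 ** (epoch // 10))

lr_schedule_callback = LearningRateScheduler(step_decay)
# Add lr_schedule_callback to the callbacks list passed to the training call.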

Bartzi avatar Aug 03 '20 08:08 Bartzi

@Bartzi Thanks a lot for your suggestion! Would you kindly tell me how I can modify the callback object in the training script to scale the learning rate?

Arafat4341 avatar Aug 04 '20 00:08 Arafat4341

The best tip is to have a look at the Keras documentation :wink: This page, for instance, could be helpful: https://faroit.com/keras-docs/1.2.2/callbacks/

Bartzi avatar Aug 04 '20 08:08 Bartzi

@Bartzi Thanks a lot!

I found ReduceLROnPlateau. We can add this to reduce the learning rate if the accuracy doesn't improve after a certain number of epochs. Currently the learning rate we are using is 0.001. Can you suggest a lower bound for the learning rate? I can't guess how low would be too low!

Arafat4341 avatar Aug 06 '20 08:08 Arafat4341

As with most things in deep learning: you have to try. My first idea would be to set it to 0.0001 and see what happens. Going lower than 1e-6 as a starting learning rate is not a good idea, so anything below that would be too low.
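To make that concrete, a rough sketch under those bounds might look like this (assuming the Adam optimizer from the config; the exact values are just a starting point):

from keras.optimizers import Adam
from keras.callbacks import ReduceLROnPlateau

optimizer = Adam(lr=0.0001)  # suggested starting learning rate
# Shrink the learning rate when val_loss plateaus, but never drop below the 1e-6 floor.
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-6)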

Bartzi avatar Aug 06 '20 10:08 Bartzi

@Bartzi Thanks a lot. The model still converges early: training stops at epoch 23, with no improvement in validation accuracy over the last 20 epochs.

# Stop training once val_loss has not improved for 15 consecutive epochs.
early_stopping_callback = EarlyStopping(monitor='val_loss', min_delta=0, patience=15, verbose=1, mode="min")
# Multiply the learning rate by 0.2 after 5 epochs without val_loss improvement, down to a floor of 1e-4.
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=0.0001)
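For reference, these callbacks only take effect when passed to the training call, roughly like the sketch below (the model and generator names are placeholders, and whether the script uses fit or fit_generator is an assumption here):

# Sketch: wire the callbacks into training (Keras 1.x style arguments).
model.fit_generator(train_generator,
                    samples_per_epoch=num_train_samples,
                    nb_epoch=50,
                    validation_data=val_generator,
                    nb_val_samples=num_val_samples,
                    callbacks=[early_stopping_callback, reduce_lr])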

I collected the data from YouTube. The problem is that my dataset is not very big: only 3328 mel-spectrogram images in the training set. A problem occurred while downloading the data; youtube_dl returned a "too many requests" error, so I couldn't download as much data as I wanted. Could this be a problem with my data? Am I trying to train on really poor data?

Arafat4341 avatar Aug 07 '20 06:08 Arafat4341

Yes, such a low amount of training data might very well be a problem. Maybe you can try to gather more data; I think this should help your results.

Bartzi avatar Aug 11 '20 10:08 Bartzi