aitextgen

Continuing training for a pre-trained model doesn't seem to work

Open • accountForIssues opened this issue 2 years ago • 1 comment

Maybe I am doing it wrong or misunderstanding how it works, but I cannot seem to continue training a pre-trained model.

Initial training

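from aitextgen import aitextgen

# Start from the base GPT-2 124M weights
ai = aitextgen(tf_gpt2="124M", to_gpu=True)
# or
ai = aitextgen(model="EleutherAI/gpt-neo-125M", to_gpu=True)
# Train for 100 steps, saving every 100 steps
ai.train(train_data=train_file, num_steps=100, save_every=100)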

Later on

# Reload the saved model from the trained_model folder and train again
ai = aitextgen(model_folder="trained_model", to_gpu=True)
ai.train(train_data=train_file, num_steps=200, save_every=100)

The training starts from 0 again. Looking at loss values, it's clear that it started from scratch.

Are there any other settings I need to use to continue the training rather than start from scratch, something like overwrite=True, restore_from='latest' in gpt-2-simple?
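One thing worth ruling out before the "Later on" step is that the folder being loaded lacks saved weights. Below is a minimal sketch of such a check, not a confirmed fix. The helper name `resume_kwargs` and the `EXPECTED_FILES` list are hypothetical, and the file names `config.json` and `pytorch_model.bin` are an assumption about what the saved folder contains. Only `tf_gpt2` and `model_folder` come from the snippets above.

```python
from pathlib import Path

# Assumption: files a saved model folder is expected to contain.
# Not confirmed in this thread; compare against your own trained_model folder.
EXPECTED_FILES = ("config.json", "pytorch_model.bin")


def resume_kwargs(model_folder="trained_model", fallback_size="124M"):
    """Pick constructor kwargs (hypothetical helper).

    Returns {"model_folder": ...} when the folder looks like a saved model,
    otherwise {"tf_gpt2": ...} so training starts from the base weights.
    """
    folder = Path(model_folder)
    missing = [name for name in EXPECTED_FILES if not (folder / name).is_file()]
    if missing:
        print(f"{folder} is missing {missing}; falling back to tf_gpt2={fallback_size}")
        return {"tf_gpt2": fallback_size}
    return {"model_folder": str(folder)}


# Usage, mirroring the snippets above (requires aitextgen and train_file):
# kwargs = resume_kwargs("trained_model")
# ai = aitextgen(**kwargs, to_gpu=True)
# ai.train(train_data=train_file, num_steps=200, save_every=100)
```

If the check reports the folder as complete and the loss still looks like a fresh run, the folder contents are probably not the cause.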

accountForIssues, May 16 '22 11:05

Try setting num_steps and batch_size higher.

tientr, May 21 '22 03:05