BLINK
Bug report: training stalls when continuing from a saved epoch
We managed to train ELQ on our own dataset. When we tried to continue training from a certain epoch with the same training data (to save time), the model seemed to stop improving: the loss drops at a very small rate, and the p/r/f1 scores stop changing.
The only change we made to the code and scripts was passing the starting epoch as the ${12} argument to train_elq.sh.
I expected the model to proceed as if training had never stopped, or at least to keep improving, perhaps more slowly due to a learning rate change. Instead it stopped improving entirely, so there is probably a bug here, most likely in the learning rate handling.
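One way a resumed run can stall like this is if the learning rate scheduler is rebuilt on resume with a shorter horizon while the global step counter picks up where it left off. The following is a minimal, self-contained sketch of that failure mode, assuming a linear warmup/decay schedule of the kind commonly used for fine-tuning BERT-style models; the function name and the concrete step counts are hypothetical, not taken from the ELQ code.

```python
def linear_warmup_decay_lr(base_lr, step, warmup_steps, total_steps):
    """Linear warmup to base_lr, then linear decay to zero
    (the schedule shape typical of BERT-style fine-tuning)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly; the LR clamps at 0 once step reaches total_steps.
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

base_lr = 1e-5

# Fresh run: 1000 optimizer steps planned, currently at step 500.
fresh_lr = linear_warmup_decay_lr(base_lr, 500, 100, 1000)

# Resumed run: if total_steps is recomputed from the *remaining* epochs
# (here 400 steps) but the global step counter resumes at 500,
# the schedule is already past its end and the LR collapses to 0,
# so the model effectively stops updating.
resumed_lr = linear_warmup_decay_lr(base_lr, 500, 100, 400)

print(fresh_lr)    # a healthy mid-training learning rate
print(resumed_lr)  # 0.0 -> training stalls
```

If something like this is happening, restoring the scheduler (or at least the original total-step count and the optimizer state) along with the model weights on resume would be the fix to look for.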