awd-lstm-lm
What does Finetune do?
The finetune.py file looks to be the same as the main.py file. The paper does not cover the techniques used in the fine-tuning stage. Which techniques matter for increasing performance?
Hi. Did you figure out what finetune.py does?
I'm not sure. This was a long time ago!
If anyone is still wondering: as far as I understood from the paper and the code, fine-tuning starts training directly with ASGD, using the last saved weights as the starting point and averaging from T=0. It stops when the validation loss or perplexity has not improved for a number of epochs (I think the last 5 or so), using the same trigger condition that switches to ASGD during normal training. This is mentioned in Section 5 of the paper (Regularizing and Optimizing LSTM Language Models). I don't think it differs from normal training, except that averaging starts from the point where fine-tuning begins.
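
To make the two ideas above concrete, here is a minimal pure-Python sketch of a non-monotone stopping check and of weight averaging that starts at step 0. It is only an illustration of my reading of the description, not the code in finetune.py: the names `nonmono_triggered` and `RunningAverage` are made up for this example, and the default window of 5 epochs is an assumption taken from the "last 5 or so" guess above.

```python
def nonmono_triggered(val_losses, nonmono=5):
    """Return True when the latest validation loss is worse than the best
    loss recorded more than `nonmono` epochs ago (hypothetical helper)."""
    history, current = val_losses[:-1], val_losses[-1]
    # Require more than `nonmono` earlier epochs before the check can fire.
    return len(history) > nonmono and current > min(history[:-nonmono])


class RunningAverage:
    """Plain running mean of the weights, starting at the first update
    (hypothetical stand-in for ASGD's averaging with T=0)."""

    def __init__(self, weights):
        self.avg = list(weights)  # starts at the last saved weights
        self.count = 0

    def update(self, weights):
        # Incremental mean: avg_k = avg_{k-1} + (w_k - avg_{k-1}) / k
        self.count += 1
        self.avg = [a + (w - a) / self.count for a, w in zip(self.avg, weights)]


if __name__ == "__main__":
    losses = [5.0, 4.0, 3.0, 3.1, 3.2, 3.3, 3.4]
    print(nonmono_triggered(losses, nonmono=2))  # stop fine-tuning if True
```

During normal training the same kind of check would decide when to switch from SGD to ASGD; during fine-tuning it would decide when to stop, with averaging active from the first step.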