Initialize LSTM forget gate with one
This paper reports significantly better results on a range of tasks when the LSTM forget gate bias is initialized to one:
http://www.jmlr.org/proceedings/papers/v37/jozefowicz15.pdf
How can I initialize the forget gate with 1?
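
To make the question concrete, here is a rough, framework-agnostic sketch of what I mean. It assumes the four gate biases are stored in one flat vector in the order input, forget, cell, output. That order is only an assumption for illustration (real libraries lay this out differently), and `init_lstm_bias` and `GATE_ORDER` are hypothetical names, not part of any library.

```python
import math

# Assumed gate order inside the concatenated bias vector.
# This is an illustration only; check your framework's actual layout.
GATE_ORDER = ("input", "forget", "cell", "output")


def init_lstm_bias(hidden_size, forget_bias=1.0):
    """Return a flat bias list of length 4 * hidden_size.

    Every entry is zero except the forget-gate slice, which is set
    to forget_bias.
    """
    bias = [0.0] * (len(GATE_ORDER) * hidden_size)
    start = GATE_ORDER.index("forget") * hidden_size
    for i in range(start, start + hidden_size):
        bias[i] = forget_bias
    return bias


def sigmoid(x):
    """Logistic function used for the LSTM gate activations."""
    return 1.0 / (1.0 + math.exp(-x))


bias = init_lstm_bias(hidden_size=3)
print(bias)

# If the weighted inputs contribute roughly zero at the start of training,
# the forget gate's pre-activation is just its bias.
print(sigmoid(0.0), sigmoid(1.0))
```

With a bias of zero the forget gate starts at 0.5, while a bias of one puts it at about 0.73, so the cell state is mostly kept at the start of training. What I am missing is how to do the equivalent in an actual framework.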