Associative_LSTM

Convergence rate?

Open juesato opened this issue 8 years ago • 0 comments

Hi,

Thank you for releasing your implementation! I'm testing the model in Torch and am having difficulty replicating the results from the original paper, so I wanted to compare against your implementation to make sure there isn't a bug in mine. (I see much better results from a naive LSTM, even without forget gate initialization, than those reported in the paper, and the Associative LSTM seems to have comparable performance per number of updates.)

I have a few questions:

  1. I'm having trouble getting the loss down to 0.03 as in the figure in the README. I've copy-pasted my log file below; does this look usual, and how long should it take for the loss to reach 0.03?

  2. Were there any implementation tricks you found important for getting the model to train well? I've zeroed the h-to-u connections as in the paper and use a single copy of the HRR memory. I looked at your code and didn't see anything obviously different, but it's possible I'm missing something, and I'm wondering if any obvious possibilities stand out.
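For readers unfamiliar with the term, here is a minimal sketch of what "a single copy of the HRR memory" refers to. It is written in NumPy rather than Torch and is not taken from either implementation discussed here. It assumes complex-valued keys with exactly unit modulus (a simplification chosen for illustration), and the helper names `bind`, `retrieve`, and `random_unit_key` are hypothetical.

```python
import numpy as np

def random_unit_key(rng, dim):
    """Hypothetical helper: a complex key whose entries all have modulus 1."""
    phase = rng.uniform(0.0, 2.0 * np.pi, size=dim)
    return np.exp(1j * phase)

def bind(key, value):
    """Bind a key to a value by element-wise complex multiplication."""
    return key * value

def retrieve(key, memory):
    """Read from memory by multiplying with the complex conjugate of the key."""
    return np.conj(key) * memory

rng = np.random.default_rng(0)
dim = 64  # illustrative size, not a value from the paper or the repository

keys = [random_unit_key(rng, dim) for _ in range(3)]
values = [rng.standard_normal(dim) + 1j * rng.standard_normal(dim) for _ in range(3)]

# With one stored item, retrieval is exact because conj(k) * k == 1 element-wise.
memory_one = bind(keys[0], values[0])
recovered_one = retrieve(keys[0], memory_one)

# With several items in a single memory copy, the other pairs act as noise
# on top of the value being retrieved.
memory_all = sum(bind(k, v) for k, v in zip(keys, values))
recovered_noisy = retrieve(keys[0], memory_all)
```

With a single copy there is nothing to average over, so the noise in `recovered_noisy` is not reduced; using several redundant copies is the usual way to reduce it.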

Thank you

Logfile:

iterations_done:0

iterations_done:1000
train_CE:0.370743

iterations_done:2000
train_CE:0.174449

iterations_done:3000
train_CE:0.168744

iterations_done:4000
train_CE:0.156446

iterations_done:5000
train_CE:0.141770

iterations_done:6000
train_CE:0.126022

iterations_done:7000
train_CE:0.116777

iterations_done:8000
train_CE:0.108515

iterations_done:9000
train_CE:0.100494

iterations_done:10000
train_CE:0.094278

iterations_done:11000
train_CE:0.089932

iterations_done:12000
train_CE:0.083656

iterations_done:13000
train_CE:0.079061

iterations_done:14000
train_CE:0.071482

iterations_done:15000
train_CE:0.175424

iterations_done:16000
train_CE:0.175366

iterations_done:17000
train_CE:0.174720

iterations_done:18000
train_CE:0.174372

iterations_done:19000
train_CE:0.178491

iterations_done:20000
train_CE:0.173044
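A log in this format can be scanned for jumps in the loss with a short script. The following is a minimal sketch that assumes the alternating `iterations_done:` / `train_CE:` layout shown above; the function names `parse_log` and `largest_increase` are hypothetical, and the sample text is a three-entry excerpt of the log.

```python
import re

def parse_log(text):
    """Return (iteration, train_CE) pairs from the two-line log format."""
    pairs = []
    current_iter = None
    for raw in text.splitlines():
        line = raw.strip()
        m = re.fullmatch(r"iterations_done:(\d+)", line)
        if m:
            current_iter = int(m.group(1))
            continue
        m = re.fullmatch(r"train_CE:([0-9.]+)", line)
        if m and current_iter is not None:
            pairs.append((current_iter, float(m.group(1))))
    return pairs

def largest_increase(pairs):
    """Return (iteration, delta) for the biggest rise between consecutive entries."""
    best = None
    for (_, prev_loss), (it, loss) in zip(pairs, pairs[1:]):
        delta = loss - prev_loss
        if best is None or delta > best[1]:
            best = (it, delta)
    return best

# Excerpt copied from the log above.
sample = """
iterations_done:13000
train_CE:0.079061

iterations_done:14000
train_CE:0.071482

iterations_done:15000
train_CE:0.175424
"""

pairs = parse_log(sample)
jump = largest_increase(pairs)
print(pairs)
print(jump)
```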

juesato, Dec 20 '16 22:12