[Q] What does lambda (annealing speed) do?
Hello, great work!
In the paper and the code, you refer to an "annealing speed" lambda ranging over 10*(2^i) for i = 0, 1, ..., 8.
What does this refer to? Is it a learning rate annealer? Do you mean this is how often you reduce the learning rate?
I tried to read the code but could not figure out what seek does (the lambda parameter is used to update this data seek thing).
Thank you very much!