webppl
Default params for optimize
Seems like adam with step size 0.01 is better than the current default in most of the small models I've tried. Change?
Also, change the default number of forward samples to 100 (or something greater than 1)?
Regarding the stepSize: I also got a significant improvement with 0.01, especially for dream.
I've also noticed that small models typically benefit from a larger step size, so if that's the use case we're optimizing for, increasing it seems reasonable to me. Some history.
BTW, we currently inherit the default stepSize from adnn.
Wait, is the default adam with 0.001? (I thought the default was sgd.) If so, then maybe I'm fine with it as is...
What about the default number of forward samples? Is 1 good?
Right, that was confusing. It should probably be 100.
@ngoodman is there a reason not to change the default from 0.001 to 0.01? In all the cases I checked, 0.01 was better. @null-a changed the default some time ago from 0.1 to 0.001 (dritchie/adnn@4f4dba2), I guess because 0.1 is indeed unstable. But 0.001 seems to be too low.
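
To make the step size trade-off concrete, here is a rough standalone sketch in plain JavaScript. It is not webppl's or adnn's optimizer: `adamMinimize`, the quadratic objective, and the step count are all made up for illustration, and the decay rates are just the commonly used Adam values. It only shows why 0.001 can look too slow on a tiny problem when the number of steps is small.

```javascript
// Toy Adam on f(x) = (x - 3)^2, starting from x = 0.
// Hypothetical helper for illustration; not the webppl/adnn implementation.
function adamMinimize(stepSize, steps) {
  var beta1 = 0.9;    // decay rate for the first moment estimate (assumed)
  var beta2 = 0.999;  // decay rate for the second moment estimate (assumed)
  var eps = 1e-8;     // avoids division by zero
  var x = 0, m = 0, v = 0;
  for (var t = 1; t <= steps; t++) {
    var g = 2 * (x - 3);                      // gradient of (x - 3)^2
    m = beta1 * m + (1 - beta1) * g;          // running mean of gradients
    v = beta2 * v + (1 - beta2) * g * g;      // running mean of squared gradients
    var mHat = m / (1 - Math.pow(beta1, t));  // bias-corrected first moment
    var vHat = v / (1 - Math.pow(beta2, t));  // bias-corrected second moment
    x -= stepSize * mHat / (Math.sqrt(vHat) + eps);
  }
  return x;
}

var steps = 200;
var errSmall = Math.abs(adamMinimize(0.001, steps) - 3);
var errLarge = Math.abs(adamMinimize(0.01, steps) - 3);
console.log('stepSize 0.001, distance from optimum:', errSmall);
console.log('stepSize 0.01,  distance from optimum:', errLarge);
```

Because Adam normalizes the gradient, each update moves the parameter by roughly `stepSize` at most, so with 0.001 and a couple of hundred steps the parameter can barely leave its starting point. This toy says nothing about stability on larger models, which is presumably where the concern about 0.1 comes from.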