Separate eval from train for lambda_i and ci_ma

Open · chokkyvista opened this issue 3 years ago • 1 comment

  1. only update lambda during training
  2. use separate moving-averages for train vs. eval (similar to batch-norm I guess?)

I find item 1 crucial for stabilizing training under GECO: without it, I can only train the "latent" model with the default batch size (256), no matter how carefully I tune the learning rate. With this change, I've managed to train with much larger batch sizes, e.g. 1024.
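
For readers skimming the thread, here is a rough sketch of what the two changes amount to. It is not the code in this PR: the class name `GecoState`, its parameters, and the multiplicative update `lambda *= exp(lr * ci_ma)` are assumptions chosen for illustration, written in plain Python so it runs without the repository.

```python
import math


class GecoState:
    """Hypothetical holder for the GECO multiplier and constraint averages."""

    def __init__(self, lambda_init=1.0, alpha=0.99, lambda_lr=0.01):
        self.lambda_ = lambda_init
        self.alpha = alpha          # decay of the moving average
        self.lambda_lr = lambda_lr  # step size of the multiplier update
        # Change 2: one moving average per mode, so eval batches never
        # leak into the statistics that drive training.
        self.ci_ma = {"train": None, "eval": None}

    def step(self, constraint, training):
        """Update the moving average for the current mode; return lambda."""
        mode = "train" if training else "eval"
        prev = self.ci_ma[mode]
        if prev is None:
            ma = constraint  # first batch initialises the average
        else:
            ma = self.alpha * prev + (1.0 - self.alpha) * constraint
        self.ci_ma[mode] = ma

        # Change 1: lambda moves only during training.
        # Assumed update rule; the repository may use a different one.
        if training:
            self.lambda_ *= math.exp(self.lambda_lr * ma)
        return self.lambda_
```

In a PyTorch module, the `training` flag would typically come from `self.training`, which is what makes this resemble batch-norm's handling of its running statistics.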

(edit: correct legend color)

[Two images, side by side: batch_size=256 (default) | batch_size=1024 (4x)]

chokkyvista · Apr 01 '21 11:04

(no code-change in the above force-push, only switching user.email to match my github account)

chokkyvista · Apr 02 '21 08:04