Minjie Xu
Depends on #4. I found #4 to be more or less sufficient to get aspect 0 training stably (even with batch size 1024), but not so much for aspects 1 and 2...
1. only update lambda **during training**
2. use **separate** moving averages for `train` vs. `eval` (similar to batch-norm, I guess?)

I find 1 to be **crucial** for stabilizing the training under...
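The two tricks above could be sketched roughly as follows (a minimal sketch with hypothetical names, not this repo's actual API): the multiplier is only stepped in train mode, and the constraint's moving average is tracked per mode, analogous to batch-norm's running statistics.

```python
class ConstraintLambda:
    """Sketch: lambda updated only during training, with separate
    per-mode exponential moving averages of the constraint value."""

    def __init__(self, lr=0.1, momentum=0.9):
        self.lam = 0.0                      # Lagrange multiplier, kept >= 0
        self.lr = lr
        self.momentum = momentum
        self.ema = {"train": 0.0, "eval": 0.0}

    def observe(self, constraint_value, training):
        mode = "train" if training else "eval"
        # mode-specific moving average, so eval-time statistics
        # never leak into the multiplier update (trick 2)
        self.ema[mode] = (self.momentum * self.ema[mode]
                          + (1 - self.momentum) * constraint_value)
        if training:
            # gradient *ascent* on lambda, only in train mode (trick 1),
            # clamped at 0 so the multiplier stays non-negative
            self.lam = max(0.0, self.lam + self.lr * self.ema["train"])
        return self.lam
```

With this, calling `observe(..., training=False)` changes only the eval-side statistics and leaves lambda untouched, which is the point of trick 1.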
Per the title: since the augmented Lagrangian is a **minimax** problem (i.e. min w.r.t. model parameters, yet max w.r.t. lambdas), it doesn't really make sense to always prefer the lower...
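A toy illustration of that minimax structure (not this repo's code, just an assumed textbook setup): minimize x^2 subject to x >= 1 via the augmented Lagrangian L(x, lam) = x^2 + lam*(1 - x) + rho/2*(1 - x)^2, descending on x while *ascending* on lam. Since lam keeps growing while the constraint is violated, the raw loss value shifts over time and is not comparable across checkpoints.

```python
rho, lam, x = 10.0, 0.0, 0.0
for _ in range(4000):
    g = 1.0 - x                          # constraint violation (want g <= 0)
    grad_x = 2.0 * x - lam - rho * g     # dL/dx
    x -= 0.01 * grad_x                   # min step w.r.t. x
    lam = max(0.0, lam + 0.05 * g)       # max step w.r.t. lam (ascent)
# settles at the constrained optimum x = 1, with lam = 2
```

At the fixed point, stationarity in x gives 2x - lam - rho*(1 - x) = 0 with x = 1, hence lam = 2, even though the unconstrained loss x^2 alone would prefer smaller x.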