learning-not-to-learn

How did you choose _lambda = 0.01?

ricvolpi opened this issue on Oct 9, 2019 · 3 comments

Hi - thank you for releasing the code for your paper.

I was playing with your code because my colleagues and I are comparing a method we designed with yours. I was wondering how you came up with the hyper-parameter choice _lambda = 0.01 (trainer.py, line 107). I couldn't find the discussion around the hyper-parameter selection in your paper (my apologies if I have missed it).
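
(For context, and as far as I can tell, `_lambda` weights a bias-related "unlearning" term against the main classification loss. Below is a minimal illustration of a weighted two-term objective of that kind; the names are mine, not your code's, and the paper's actual unlearning term is adversarial / mutual-information based rather than the plain cross-entropy written here.)

```python
# Minimal sketch of the kind of weighted objective _lambda controls.
# Illustrative names only; not the repo's actual trainer.py code.
import torch.nn.functional as F

_lambda = 0.01  # the hyper-parameter in question (trainer.py, line 107)

def combined_loss(class_logits, labels, bias_logits, bias_labels):
    main = F.cross_entropy(class_logits, labels)         # main-task loss
    unlearn = F.cross_entropy(bias_logits, bias_labels)  # bias-related term
    return main + _lambda * unlearn                      # _lambda trades them off
```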

Best, Riccardo

ricvolpi · Oct 09 '19 13:10

Basically, it's a rule of thumb :)

Fortunately, the algorithm was not very sensitive to the choice of lambda in our experiments.

Very sorry for the late response. Byungju Kim

feidfoe · Mar 03 '20 04:03

Hi Byungju,

Many thanks for your reply. What is the rule of thumb though? :)

I have tried several values for lambda, and the model actually seems pretty sensitive to it. Also, I could not work out how the published results were obtained, since I could not replicate them by running the released code as is.

Were they all obtained with the same lambda value? Is any form of early stopping involved?

Below are the final outputs for different combinations of benchmark, lambda value, and random seed.

Thanks in advance, Riccardo


```
[unlearn_0.02_lambda_0.0_seed_213]    INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.7118 (7118/10000)
[unlearn_0.02_lambda_0.001_seed_213]  INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.1333 (1333/10000)
[unlearn_0.02_lambda_0.01_seed_213]   INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.7336 (7336/10000) <- best
[unlearn_0.02_lambda_0.1_seed_213]    INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.5819 (5819/10000)
[unlearn_0.02_lambda_1.0_seed_213]    INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.4728 (4728/10000)

[unlearn_0.025_lambda_0.0_seed_213]   INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.7555 (7555/10000)
[unlearn_0.025_lambda_0.001_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.7069 (7069/10000)
[unlearn_0.025_lambda_0.01_seed_213]  INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.6885 (6885/10000)
[unlearn_0.025_lambda_0.1_seed_213]   INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8289 (8289/10000) <- best
[unlearn_0.025_lambda_1.0_seed_213]   INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.6464 (6464/10000)

[unlearn_0.03_lambda_0.0_seed_213]    INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8516 (8516/10000) <- best
[unlearn_0.03_lambda_0.001_seed_213]  INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.7695 (7695/10000)
[unlearn_0.03_lambda_0.01_seed_213]   INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.7982 (7982/10000)
[unlearn_0.03_lambda_0.1_seed_213]    INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.7383 (7383/10000)
[unlearn_0.03_lambda_1.0_seed_213]    INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.6113 (6113/10000)

[unlearn_0.035_lambda_0.0_seed_213]   INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8506 (8506/10000)
[unlearn_0.035_lambda_0.001_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8781 (8781/10000) <- best
[unlearn_0.035_lambda_0.01_seed_213]  INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8514 (8514/10000)
[unlearn_0.035_lambda_0.1_seed_213]   INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.7488 (7488/10000)
[unlearn_0.035_lambda_1.0_seed_213]   INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.7396 (7396/10000)

[unlearn_0.04_lambda_0.0_seed_213]    INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8655 (8655/10000)
[unlearn_0.04_lambda_0.001_seed_213]  INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8747 (8747/10000) <- best
[unlearn_0.04_lambda_0.01_seed_213]   INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8689 (8689/10000)
[unlearn_0.04_lambda_0.1_seed_213]    INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8130 (8130/10000)
[unlearn_0.04_lambda_1.0_seed_213]    INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.7493 (7493/10000)

[unlearn_0.045_lambda_0.0_seed_213]   INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.9277 (9277/10000) <- best
[unlearn_0.045_lambda_0.001_seed_213] INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.9277 (9277/10000) <- best
[unlearn_0.045_lambda_0.01_seed_213]  INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.9017 (9017/10000)
[unlearn_0.045_lambda_0.1_seed_213]   INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8775 (8775/10000)
[unlearn_0.045_lambda_1.0_seed_213]   INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8682 (8682/10000)

[unlearn_0.05_lambda_0.0_seed_213]    INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.9219 (9219/10000)
[unlearn_0.05_lambda_0.001_seed_213]  INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.9334 (9334/10000)
[unlearn_0.05_lambda_0.01_seed_213]   INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.9429 (9429/10000) <- best
[unlearn_0.05_lambda_0.1_seed_213]    INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8914 (8914/10000)
[unlearn_0.05_lambda_1.0_seed_213]    INFO: EVALUATION LOSS 0.0000, ACCURACY : 0.8712 (8712/10000)
```
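
(For anyone tabulating runs like these: the `<- best` markers per benchmark can be derived mechanically. A small stdlib-only sketch, with the tag and accuracy format assumed from the lines above:)

```python
# Pick the best lambda per benchmark from log lines in the format shown above.
import re
from collections import defaultdict

# Matches tags like [unlearn_0.05_lambda_0.01_seed_213] ... ACCURACY : 0.9429
LINE = re.compile(
    r"\[unlearn_([\d.]+)_lambda_([\d.]+)_seed_(\d+)\].*ACCURACY : ([\d.]+)"
)

def best_lambda_per_benchmark(log_lines):
    best = defaultdict(lambda: (None, -1.0))  # benchmark -> (lambda, accuracy)
    for line in log_lines:
        m = LINE.search(line)
        if not m:
            continue
        bench, lam, acc = m.group(1), float(m.group(2)), float(m.group(4))
        if acc > best[bench][1]:
            best[bench] = (lam, acc)
    return dict(best)

# Example on two of the lines above:
logs = [
    "[unlearn_0.05_lambda_0.01_seed_213] INFO: EVALUATION LOSS 0.0000, "
    "ACCURACY : 0.9429 (9429/10000)",
    "[unlearn_0.05_lambda_0.1_seed_213] INFO: EVALUATION LOSS 0.0000, "
    "ACCURACY : 0.8914 (8914/10000)",
]
print(best_lambda_per_benchmark(logs))  # {'0.05': (0.01, 0.9429)}
```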

ricvolpi · Mar 03 '20 12:03

Hi, Riccardo.

Try pre-training: train your network without the h network before starting the adversarial process.

When we tried this, the overall performance improved a bit. If it does not work, there may be a bug in the released version of our code.

We'll try to find it. Thanks.
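
(A rough sketch of this two-phase schedule, assuming a PyTorch setup with a feature extractor `f`, classifier `g`, and bias predictor `h`; all names are hypothetical, and the repo's trainer may be organized differently. The sign flip in phase 2 stands in for the paper's gradient-reversal / mutual-information term.)

```python
import torch.nn.functional as F

def train(f, g, h, loader, opt_main, opt_h, _lambda=0.01,
          pretrain_epochs=10, adv_epochs=90):
    # Phase 1: pre-train f and g alone on the main task; h stays untouched.
    for _ in range(pretrain_epochs):
        for x, y, _bias in loader:
            opt_main.zero_grad()
            F.cross_entropy(g(f(x)), y).backward()
            opt_main.step()

    # Phase 2: the usual adversarial process, now from a warm start.
    for _ in range(adv_epochs):
        for x, y, bias in loader:
            # Update h to predict the bias from detached features.
            opt_h.zero_grad()
            F.cross_entropy(h(f(x).detach()), bias).backward()
            opt_h.step()

            # Update f and g: main-task loss plus a _lambda-weighted term that
            # pushes the features to be uninformative about the bias (a plain
            # sign flip here, not the paper's exact formulation).
            opt_main.zero_grad()
            loss = (F.cross_entropy(g(f(x)), y)
                    - _lambda * F.cross_entropy(h(f(x)), bias))
            loss.backward()
            opt_main.step()
```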

Byungju Kim

feidfoe · Mar 10 '20 02:03