MART
Log gain of all examples instead of unsuccessful examples.
## What does this PR do?
This PR makes `Adversary` log the gain of all examples, instead of only the gain of unsuccessful examples. With this change, the gain shown on the progress bar should increase if the attack is working.
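To illustrate the change, here is a minimal sketch (function and variable names are hypothetical, not MART's actual API) contrasting the old metric, averaged over the shrinking set of unsuccessful examples, with the new one, averaged over all examples:

```python
import torch

def gain_all(gain: torch.Tensor) -> torch.Tensor:
    """New behavior: log the gain averaged over ALL examples."""
    return gain.mean()

def gain_unsuccessful(gain: torch.Tensor, success: torch.Tensor) -> torch.Tensor:
    """Old behavior: average only over examples the attack has not yet fooled."""
    active = ~success  # boolean mask of still-unsuccessful examples
    return gain[active].mean()

# Three examples; the attack has already succeeded on the first two.
gain = torch.tensor([3.0, 2.0, 1.0])
success = torch.tensor([True, True, False])

print(gain_all(gain).item())                    # 2.0, averaged over 3 examples
print(gain_unsuccessful(gain, success).item())  # 1.0, averaged over 1 remaining example
```

As successful examples drop out of the old average, the logged number can fall even while the attack improves; the new average over all examples does not have that artifact.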
## Type of change
Please check all relevant options.
- [ ] Improvement (non-breaking)
- [x] Bug fix (non-breaking)
- [ ] New feature (non-breaking)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] This change requires a documentation update
## Testing
Please describe the tests that you ran to verify your changes. Consider listing any relevant details of your test configuration.
- [x] `pytest`
- [x] `CUDA_VISIBLE_DEVICES=0 python -m mart experiment=CIFAR10_CNN_Adv trainer=gpu trainer.precision=16` reports 70% (21 sec/epoch).
- [x] `CUDA_VISIBLE_DEVICES=0,1 python -m mart experiment=CIFAR10_CNN_Adv trainer=ddp trainer.precision=16 trainer.devices=2 model.optimizer.lr=0.2 trainer.max_steps=2925 datamodule.ims_per_batch=256 datamodule.world_size=2` reports 70% (14 sec/epoch).
## Before submitting
- [x] The title is self-explanatory and the description concisely explains the PR
- [x] My PR does only one thing, instead of bundling different changes together
- [ ] I list all the breaking changes introduced by this pull request
- [x] I have commented my code
- [ ] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes
- [x] I have run pre-commit hooks with the `pre-commit run -a` command without errors
## Did you have fun?
Make sure you had fun coding 🙃
Why is this necessary? Usually you want to log the actual loss you compute gradients of?
The trend could be confusing in the past: while we try to maximize the gain, the number in the progress bar goes down, because it gradually excludes successful examples.
What is confusing about the trend? That it is possible for the loss to go up? But that should be expected if you understand that the loss is only computed on some examples. Perhaps what you want to do instead is zero out the loss for those examples that are already adversarial or take the sum? You don't get the (potential?) speed up benefit though. I would note that the only thing that changes by doing the first thing is just the normalization constant (i.e., the total number of samples when averaging the loss across samples is fixed instead of changing).
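The reviewer's alternative can be sketched as follows (illustrative code, not MART's implementation): zeroing out the loss of already-adversarial examples while keeping the batch size as the normalization constant yields the same numerator as averaging over the active subset; only the denominator differs.

```python
import torch

def mean_over_active(gain: torch.Tensor, success: torch.Tensor) -> torch.Tensor:
    """Average the gain over the shrinking set of not-yet-successful examples."""
    active = ~success
    return gain[active].sum() / active.sum()

def masked_fixed_norm(gain: torch.Tensor, success: torch.Tensor) -> torch.Tensor:
    """Zero out successful examples, but divide by the FIXED total count N."""
    return torch.where(success, torch.zeros_like(gain), gain).sum() / gain.numel()

gain = torch.tensor([4.0, 4.0, 2.0])
success = torch.tensor([True, True, False])

# Same numerator (2.0); only the normalization constant differs (1 vs. 3).
print(mean_over_active(gain, success).item())   # 2.0
print(masked_fixed_norm(gain, success).item())  # ~0.6667
```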