Alexandre Sablayrolles
@samdow thanks for the detailed explanation, that's also what I thought. It seems there is no smarter way to avoid the computation, so it's probably good to keep as is....
@ffuuugor I'm thinking of committing this as an alternative grad_sampler, but keeping the unfold-based one as the default. Do we have a clean way to support multiple grad_samplers? Or should...
Thanks for flagging this. We are currently discussing this with the PyTorch team, as the new proposed hooks are not ideal for our use case.
Thanks for flagging, @mmsaki. Currently, it is a warning, so you can safely ignore it. We are working on a solution for the next version of PyTorch.
Just wanted to point out that this is now possible with Functorch (`grad_sample_mode="no_op"`). It should be straightforward to adapt the CIFAR-10 example, for instance, to handle data augmentations.
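For concreteness, here is a rough, untested sketch of how that could look. The toy model, the random pre-augmented data, the augmentation multiplicity `K`, and the `_module` access are illustrative assumptions, not the official tutorial: with `grad_sample_mode="no_op"`, Opacus does not compute `grad_sample`, so we fill it ourselves with functorch, averaging the loss over the K augmentations of each image.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from functorch import make_functional, grad, vmap
from opacus import PrivacyEngine

K = 4  # augmentation multiplicity (illustrative)

# Toy stand-ins: in real code each image would get K random augmentations
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
dataset = torch.utils.data.TensorDataset(
    torch.randn(64, K, 3, 32, 32),       # (N, K, C, H, W): K "augmentations" per image
    torch.randint(0, 10, (64,)),
)
train_loader = torch.utils.data.DataLoader(dataset, batch_size=16)

privacy_engine = PrivacyEngine()
model, optimizer, train_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    noise_multiplier=1.0,
    max_grad_norm=1.0,
    grad_sample_mode="no_op",  # Opacus will not compute grad_sample for us
)

# Functional version of the underlying module (stored in `_module` by the wrapper);
# we discard the copied params and always pass the live ones instead.
fmodel, _ = make_functional(model._module)

def loss_fn(params, augmented, target):
    # augmented: (K, C, H, W) -> average the loss over the K augmentations
    logits = fmodel(params, augmented)
    return F.cross_entropy(logits, target.expand(augmented.shape[0]))

# grad w.r.t. params, vmapped over the batch dimension of inputs and targets
per_sample_grads = vmap(grad(loss_fn), in_dims=(None, 0, 0))

for images, targets in train_loader:
    params = list(model.parameters())
    grads = per_sample_grads(params, images, targets)
    for p, g in zip(params, grads):
        p.grad_sample = g  # DPOptimizer clips and noises these at step()
    optimizer.step()
    optimizer.zero_grad()
```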
@romovpa: I don't believe this will work because it will create "per-sample" gradients of the regularizer. @kenziyuliu: In your particular case, there should be a workaround that consists of adding...
Closing the issue as the PR has been merged. Feel free to reopen if the fix doesn't work.
With the release of v1.2, it is now possible to have custom grad samples using Functorch, which might solve your problem, @FrancescoPinto.
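For future readers, a rough sketch of what this looks like (toy custom layer, not from this issue): with v1.2, trainable layers that have no registered grad sampler, and no buffers, fall back to the functorch-based per-sample gradient computation inside `GradSampleModule`.

```python
import torch
import torch.nn as nn
from opacus import GradSampleModule

class MyCustomLayer(nn.Module):
    """A trainable layer with no hand-written grad sampler in Opacus (and no buffers)."""
    def __init__(self, d):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(d, d) / d ** 0.5)

    def forward(self, x):
        return torch.tanh(x @ self.weight)

model = GradSampleModule(nn.Sequential(MyCustomLayer(8), nn.Linear(8, 2)))
x = torch.randn(16, 8)
model(x).mean().backward()  # stand-in loss, just to trigger the backward hooks

# Per-sample gradients are populated for the custom layer as well
print(model._module[0].weight.grad_sample.shape)  # torch.Size([16, 8, 8])
```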
Closing the issue as it is solved by v1.2. If this solution does not work for you, @FrancescoPinto, feel free to reopen.
With Opacus, we throw away the `.grad` given by PyTorch and recompute it using per-sample gradients. However, in this case, it looks like the `.grad` would also have signal coming...
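To make the first point concrete, a small sketch (toy linear model, random data, not from this issue) showing that after the DP optimizer processes the backward pass, `.grad` is rebuilt from the clipped (and, in general, noised) per-sample gradients rather than kept as PyTorch computed it:

```python
import torch
import torch.nn as nn
from opacus import GradSampleModule
from opacus.optimizers import DPOptimizer

model = GradSampleModule(nn.Linear(4, 1))
optimizer = DPOptimizer(
    torch.optim.SGD(model.parameters(), lr=0.1),
    noise_multiplier=0.0,     # no noise here, so any difference comes from clipping alone
    max_grad_norm=1.0,
    expected_batch_size=8,
)

x, y = torch.randn(8, 4), torch.randn(8, 1)
loss = ((model(x) - y) ** 2).mean()
loss.backward()

vanilla_grad = model._module.weight.grad.clone()  # what PyTorch accumulated
optimizer.pre_step()                              # clip per-sample grads and rebuild .grad
print(torch.allclose(vanilla_grad, model._module.weight.grad))  # typically False
```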