Incorrect comparison between privacy amplification by iteration and DP-SGD
In mnist_lr_tutorial.py, the computation of the DP budget for the DP-SGD algorithm (here: line 177) is missing a scaling of noise_multiplier by batch_size, which is needed to match the computation used for the amplification-by-iteration analysis (here: line 165).
Indeed, the current implementation of priv-by-iter adds Gaussian noise of scale noise_multiplier to the average gradient, rather than to the sum of gradients (as in the DP-SGD implementation). This is a consequence of setting num_microbatches=1 in the optimizer for performance reasons.
(To be precise, the loss gets reshaped to size [1, batch_size*grad_dimension] here, and then this entire loss gets selected and averaged here. After the noise is added, the gradient is "normalized" here by dividing by num_microbatches=1.)
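To make the scale mismatch concrete, here is a minimal numerical sketch; the batch size, clipping norm, and noise multiplier values below are made up for illustration:

```python
batch_size = 256        # B (illustrative value)
l2_norm_clip = 1.0      # C (illustrative value)
noise_multiplier = 1.1  # sigma, as passed to the optimizer

# With num_microbatches=1, the optimizer adds noise of scale sigma * C to
# the *average* gradient:
#     g_avg + N(0, (sigma * C)^2)
# Multiplying through by B, this is equivalent to adding noise of scale
# sigma * C * B to the *sum* of gradients, which is the convention the
# standard DP-SGD analysis assumes:
#     g_sum + N(0, (sigma * C * B)^2)
noise_std_on_average = noise_multiplier * l2_norm_clip           # 1.1
equivalent_noise_std_on_sum = noise_std_on_average * batch_size  # 281.6

# So the noise multiplier the DP-SGD accountant should see is sigma * B,
# not the raw sigma:
effective_noise_multiplier = noise_multiplier * batch_size
print(effective_noise_multiplier)  # 281.6
```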
To compensate for this, the analysis of priv-by-iter correctly scales the noise_multiplier by the inverse batch size, so that it reflects the scale of the noise that is actually added to the average gradient.
But the same rescaling should be applied in the DP-SGD budget computation to get a meaningful comparison.
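A sketch of the suggested fix, assuming the rdp_accountant interface (compute_rdp / get_privacy_spent) that the tutorial already uses for DP-SGD accounting; the training parameters below are placeholders, not the tutorial's actual values:

```python
from tensorflow_privacy.privacy.analysis.rdp_accountant import (
    compute_rdp, get_privacy_spent)

n = 60000                 # training-set size (MNIST)
batch_size = 256          # placeholder value
noise_multiplier = 1.1    # sigma passed to the optimizer (placeholder)
epochs = 20               # placeholder value
delta = 1e-5
orders = [1 + x / 10.0 for x in range(1, 100)] + list(range(12, 64))

# Compensate for noise of scale sigma * C being added to the average
# gradient (num_microbatches=1): relative to the sum of gradients that
# the accountant reasons about, the noise scale is sigma * batch_size.
effective_noise_multiplier = noise_multiplier * batch_size

steps = epochs * n // batch_size
rdp = compute_rdp(
    q=batch_size / n,
    noise_multiplier=effective_noise_multiplier,
    steps=steps,
    orders=orders)
eps, _, _ = get_privacy_spent(orders, rdp, target_delta=delta)
print('DP-SGD epsilon with corrected noise scale:', eps)
```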