Doubt about #steps in compute_dp_sgd_privacy.py

Open: jmg-74 opened this issue 5 years ago • 1 comment

https://github.com/tensorflow/privacy/blob/319bf81bb166f4c43ca898a509bfdc56e1558280/tensorflow_privacy/privacy/analysis/compute_dp_sgd_privacy.py#L87

About steps = int(math.ceil(epochs * n / batch_size)), I have a doubt:

  • The int cast seems redundant, since math.ceil already returns an integer in Python 3, but that's beside the point.
  • Shouldn't it be steps = int(epochs * math.ceil(n / batch_size))? When n / batch_size is not an integer, each epoch runs one more (partial) iteration than math.floor(n / batch_size), doesn't it? On the other hand, even if I'm right, the original formula may still be correct if it is meant to count "whole-step equivalents". To illustrate: with a batch size of 256, 60000 entries in the data set and 60 epochs, 60000 / 256 = 234.375, so the formula in the code gives 14063 steps (234.375 × 60 = 14062.5, rounded up), whereas my reasoning gives 235 × 60 = 14100. I hope I'm not wasting your time, but this seems worth verifying.
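The two formulas can be compared directly with a minimal sketch. The function names below are hypothetical and not part of tensorflow_privacy. The second variant assumes the training loop keeps the final partial batch of each epoch rather than dropping it.

```python
import math


def steps_ceil_over_run(n, batch_size, epochs):
    """Formula quoted in the issue: round up once over the whole run."""
    return int(math.ceil(epochs * n / batch_size))


def steps_ceil_per_epoch(n, batch_size, epochs):
    """Proposed alternative: round up within each epoch, so the partial
    last batch counts as a full step (assumes it is not dropped)."""
    return int(epochs * math.ceil(n / batch_size))


# Values from the example in this issue.
n, batch_size, epochs = 60000, 256, 60

print(steps_ceil_over_run(n, batch_size, epochs))   # 14063
print(steps_ceil_per_epoch(n, batch_size, epochs))  # 14100
```

The two counts agree whenever n is a multiple of batch_size. Otherwise the per-epoch version is larger by up to epochs - 1 steps (37 in this example).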

jmg-74, May 01 '20 17:05

My doubts seem to be justified: see the issue, now fixed, that I filed against the clone of this script for the PyTorch framework. Could someone confirm (or refute) this? (@ffuuugor? @karthikprasad? @ilyamironov?) Thank you!

jmg-74, Jun 17 '20 13:06