Doubt about #steps in compute_dp_sgd_privacy.py
https://github.com/tensorflow/privacy/blob/319bf81bb166f4c43ca898a509bfdc56e1558280/tensorflow_privacy/privacy/analysis/compute_dp_sgd_privacy.py#L87
About `steps = int(math.ceil(epochs * n / batch_size))`, I have a doubt:
- `int` seems useless, since `ceil` should return an integer, but that's not the subject.
- Shouldn't it be `steps = int(epochs * math.ceil(n / batch_size))`? When `n / batch_size` is not an integer, each epoch runs one more (partial) iteration than `math.floor(n / batch_size)`, doesn't it?

On the other hand, even if I'm right, maybe the original formula is still correct, since it computes a number of "whole-step equivalents"?

To illustrate what I mean: with a batch size of 256, 60000 entries in the data set and 60 epochs, 60000 / 256 = 234.375, so the formula in the code gives 14063 steps (234.375 × 60 = 14062.5, rounded up), whereas my reasoning gives 235 × 60 = 14100.

I hope I'm not wasting your time, but it seems worth verifying to me.
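To make the comparison concrete, here is a minimal sketch of the two formulas side by side. The function names `steps_ceil_total` and `steps_ceil_per_epoch` are hypothetical and only for illustration; they are not part of the library. The second variant assumes the training loop keeps the last partial batch of each epoch rather than dropping it.

```python
import math


def steps_ceil_total(n, batch_size, epochs):
    """Formula currently in the script: round up once, over all epochs."""
    return int(math.ceil(epochs * n / batch_size))


def steps_ceil_per_epoch(n, batch_size, epochs):
    """Proposed alternative: round up within each epoch, then multiply.

    Assumes the last partial batch of each epoch counts as a full step.
    """
    return int(epochs * math.ceil(n / batch_size))


n, batch_size, epochs = 60000, 256, 60
print(steps_ceil_total(n, batch_size, epochs))      # 14063
print(steps_ceil_per_epoch(n, batch_size, epochs))  # 14100
```

The two results differ by 37 steps in this example, and they coincide whenever `n` is a multiple of `batch_size`.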
My doubts seem to be justified: see the fixed issue I filed about the PyTorch clone of this script. Could someone confirm (or refute) this? (@ffuuugor? @karthikprasad? @ilyamironov?) Thank you!