
Likelihood attack reproducibility

Open albacrespi opened this issue 1 year ago • 5 comments

How can the data splits used by LIRA be made reproducible? I noticed that the code uses np.random to generate the indices that select rows of data (see line 303):

these_idx = np.random.choice(indices, n_train_rows, replace=False)

Would saving "these_idx" be enough to count as reproducible? Another way of solving this would be to save the probabilities calculated by the shadow models, although that could cause a disk storage problem.
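To make the question concrete, here is a minimal sketch of a third option: drawing the indices from a seeded generator, so that the splits can be regenerated from a single integer instead of being stored. The function name shadow_split_indices and its parameters are hypothetical and are not part of the library; only the np.random calls are real NumPy API.

```python
import numpy as np


def shadow_split_indices(n_rows, n_train_rows, n_shadow_models, seed):
    """Return one array of training-row indices per shadow model.

    Hypothetical helper for illustration; not the library's own code.
    """
    # A local Generator avoids touching NumPy's global random state.
    rng = np.random.default_rng(seed)
    indices = np.arange(n_rows)
    # Each call advances the generator, so every shadow model gets its own split.
    return [
        rng.choice(indices, n_train_rows, replace=False)
        for _ in range(n_shadow_models)
    ]


# The same seed reproduces the same sequence of splits.
splits_a = shadow_split_indices(n_rows=100, n_train_rows=60, n_shadow_models=3, seed=42)
splits_b = shadow_split_indices(n_rows=100, n_train_rows=60, n_shadow_models=3, seed=42)
```

With this approach only the seed needs to be recorded alongside the results. The splits still differ from one shadow model to the next; they are just repeatable across runs.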

I also noticed a comment on lines 324 to 326 saying that some classes might not be represented in a split. Can something be done to avoid this as far as possible?
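One possible direction, sketched below under my own assumptions, is to reserve one randomly chosen row per class and fill the rest of the split uniformly. The helper choice_covering_all_classes is hypothetical and not taken from the library. Note that this is no longer a uniform draw over rows, so it may affect the assumptions behind the attack; scikit-learn's train_test_split with the stratify argument would be another option.

```python
import numpy as np


def choice_covering_all_classes(labels, n_train_rows, rng):
    """Pick n_train_rows distinct row indices with every class present.

    Hypothetical helper for illustration; not the library's own code.
    """
    labels = np.asarray(labels)
    classes = np.unique(labels)
    if n_train_rows < len(classes):
        raise ValueError("n_train_rows is smaller than the number of classes")
    # Reserve one randomly chosen row for each class.
    guaranteed = np.array(
        [rng.choice(np.flatnonzero(labels == c)) for c in classes]
    )
    # Fill the remaining slots uniformly from the rows not yet chosen.
    remaining = np.setdiff1d(np.arange(len(labels)), guaranteed)
    extra = rng.choice(remaining, n_train_rows - len(guaranteed), replace=False)
    return np.concatenate([guaranteed, extra])


rng = np.random.default_rng(0)
labels = np.array([0] * 95 + [1] * 4 + [2] * 1)  # heavily imbalanced toy labels
these_idx = choice_covering_all_classes(labels, n_train_rows=20, rng=rng)
```

In this toy example class 2 has a single row, so a plain uniform draw of 20 rows out of 100 would miss it most of the time, while the sketch always includes it.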

If we make the data splits reproducible, does LIRA still make sense?

albacrespi • Aug 16 '23 15:08