tapas
Currently users have to keep their reports disaggregated by variable (e.g. generator/dataset). There should be an additional parameter in ROCReport that divides the list of summaries.
This has tighter $\varepsilon^\text{eff}$ estimation.
Currently, the default format of the plots is acceptable but not paper-ready. There should be a way to format plots properly for papers.
The PRIVBAYES generator implemented in `reprosyn` currently requires a `seed` argument (with fixed default value). This argument makes the generator deterministic, and is problematic for, e.g., epsilon-DP estimation. `PrivE` should...
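One possible direction for the seeding issue above, as a minimal sketch only: make the seed optional so that runs are randomised by default and reproducible on request. The helper `make_rng` is hypothetical and is not part of `reprosyn` or this package; it only illustrates the behaviour of NumPy's `default_rng` when passed `None`.

```python
import numpy as np


def make_rng(seed=None):
    """Hypothetical helper: build a random generator from an optional seed.

    With seed=None, NumPy draws fresh entropy from the OS, so repeated
    calls give independent streams. With an integer seed, the stream is
    reproducible.
    """
    return np.random.default_rng(seed)


# Same explicit seed: identical draws (useful for debugging).
fixed_a = make_rng(42).integers(0, 1000, size=5)
fixed_b = make_rng(42).integers(0, 1000, size=5)

# No seed: each generator is seeded independently from OS entropy.
free_a = make_rng().integers(0, 1000, size=5)
free_b = make_rng().integers(0, 1000, size=5)
```

With a default of `None`, repeated generator calls would no longer be forced onto the same random stream, while callers who need determinism can still pass a seed explicitly.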
`sample` currently samples records regardless of the index. If the record with index=0 is removed, then the dataset still tries to sample data[0]. To replicate, in the example: ``` target_record...
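The kind of index mismatch described above can be illustrated with plain pandas. This is a sketch under assumptions: the dataset is taken to wrap a `pandas.DataFrame`, and `sample_by_label` is a hypothetical helper, not the package's actual `sample` method. The idea is to draw from the labels that are actually present in the index rather than from positions `0..n-1`.

```python
import numpy as np
import pandas as pd


def sample_by_label(data: pd.DataFrame, n_samples: int, seed=None) -> pd.DataFrame:
    """Hypothetical helper: sample rows using the index labels that exist."""
    rng = np.random.default_rng(seed)
    # Draw from the surviving labels, not from range(len(data)).
    labels = rng.choice(data.index.to_numpy(), size=n_samples, replace=False)
    return data.loc[labels]


data = pd.DataFrame({"x": [10, 20, 30, 40]})
# Remove the record with index 0, as in the report above.
without_target = data.drop(index=0)
subset = sample_by_label(without_target, n_samples=3, seed=0)
```

Because the labels are drawn from `data.index`, the removed label `0` can never be requested, whereas drawing integers in `range(len(data))` and looking them up by label would still try to access `0`.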
I am pretty sure we can just remove this line, no? https://github.com/alan-turing-institute/privacy-sdg-toolbox/blob/60290c76d79c2e84d70bf21a44b99c179ea11d6a/prive/attacks/set_classifiers.py#L286
Possible enhancement: add a utility to the package that generates its own JSON files with some sensible defaults, e.g. tell me the columns that are discrete and assume everything...
The one-hot implementation can be sped up a bit by replacing the current implementation with the code below, which vectorises the lookups and lets numpy do the complex work. This...
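The code proposed in the issue is not reproduced in this listing. As a generic illustration of what a vectorised one-hot lookup can look like, here is a minimal sketch using `np.searchsorted`. The function name `one_hot_encode` and its signature are assumptions made for this example and are not taken from the package.

```python
import numpy as np


def one_hot_encode(values, categories):
    """Hypothetical helper: one-hot encode `values` against `categories`.

    Column j of the result corresponds to categories[j].
    """
    categories = np.asarray(categories)
    values = np.asarray(values)
    # Sort the categories once so every lookup is a binary search.
    order = np.argsort(categories)
    positions = np.searchsorted(categories, values, sorter=order)
    # Values beyond the largest category map to len(categories); clip
    # them so the indexing below is valid, then reject them explicitly.
    positions = np.clip(positions, 0, len(categories) - 1)
    codes = order[positions]
    if not np.array_equal(categories[codes], values):
        raise ValueError("values contains an entry that is not in categories")
    encoded = np.zeros((len(values), len(categories)), dtype=np.int8)
    # Set one entry per row in a single fancy-indexing assignment.
    encoded[np.arange(len(values)), codes] = 1
    return encoded


example = one_hot_encode(["a", "c", "b", "a"], ["b", "a", "c"])
```

The per-record Python loop is replaced by one sort of the categories, one batched binary search and one indexed assignment, which is where the speed-up would come from on large datasets.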
https://arxiv.org/pdf/2206.05199.pdf