Numerical data passed to a categorical privacy metric should raise an error

Open fealho opened this issue 4 years ago • 0 comments

And vice versa: categorical data passed to a numerical privacy metric should raise an error too. Currently, if the wrong data type is passed, the metric simply returns nan. It should raise an error instead.

Below is code to reproduce this behavior:

import pandas as pd
from sdmetrics.single_table.privacy import CategoricalCAP


data = pd.DataFrame({   # data containing only numerical values
    'key': [1.4, 10.12, 3.4],
    'sensitive': [10.9, 9.8, 8.8]
})

# privacy metric that's supposed to only work with categorical values
score = CategoricalCAP.compute(
    data,
    data,
    key_fields=['key'],
    sensitive_fields=['sensitive']
)

print(score)  # this will print `nan`
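
For illustration, here is a minimal sketch of the kind of check that could run before the metric is computed. The helper name `validate_categorical_fields` is hypothetical and not part of SDMetrics; treating boolean columns as categorical and raising `TypeError` are also assumptions, not decisions taken from the library.

```python
import pandas as pd
from pandas.api.types import is_bool_dtype, is_numeric_dtype


def validate_categorical_fields(data, fields):
    """Hypothetical helper (not part of SDMetrics).

    Raise a TypeError if any of the given columns is numerical.
    Boolean columns are treated as categorical here (an assumption).
    """
    numerical = [
        field for field in fields
        if is_numeric_dtype(data[field]) and not is_bool_dtype(data[field])
    ]
    if numerical:
        raise TypeError(
            f'Categorical privacy metric received numerical columns: {numerical}'
        )


data = pd.DataFrame({   # same numerical data as in the reproduction above
    'key': [1.4, 10.12, 3.4],
    'sensitive': [10.9, 9.8, 8.8]
})

try:
    validate_categorical_fields(data, ['key', 'sensitive'])
except TypeError as error:
    print(error)  # reports the offending columns instead of yielding nan
```

A numerical metric could apply the mirror-image check, rejecting any column that is not numerical.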

fealho · Mar 26 '21 01:03