Does `CSTest` quantify the synthesis of missing values?

Open npatki opened this issue 4 years ago • 0 comments

If I have a table with some missing values, I want to synthesize data with missing values too -- ideally in the same ratio. Is CSTest an appropriate signal of this? If it isn't, should we modify it so that it is?

Details: From the API reference:

> This function applies the single column CSTest metric to all the discrete columns found in the table and then returns the average of all the scores obtained.

I know that the SDV internally creates a new, discrete binary column representing whether a column is null. But I don't know whether this column is used in the CSTest computation, because it's dropped before the synthetic data is returned.
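To make the concern concrete, here is a minimal sketch in plain Python. It is not the SDMetrics implementation: `category_counts`, `chi_squared_statistic`, and the `MISSING` label are hypothetical names made up for illustration. It computes a Pearson chi-squared statistic for one discrete column in two ways, once dropping nulls and once counting them as their own category. I am not claiming CSTest does either; the point is only that the two choices can give very different answers.

```python
from collections import Counter

MISSING = "<missing>"  # hypothetical label standing in for null entries


def category_counts(values, include_missing):
    """Count each category; None is either dropped or counted as MISSING."""
    counts = Counter()
    for value in values:
        if value is None:
            if include_missing:
                counts[MISSING] += 1
        else:
            counts[value] += 1
    return counts


def chi_squared_statistic(real, synthetic, include_missing):
    """Pearson chi-squared statistic of the synthetic counts against the
    counts expected from the real column's category frequencies.

    Categories that appear only in the synthetic column are ignored here,
    since their expected count would be zero.
    """
    real_counts = category_counts(real, include_missing)
    synth_counts = category_counts(synthetic, include_missing)
    real_total = sum(real_counts.values())
    synth_total = sum(synth_counts.values())
    statistic = 0.0
    for category, real_count in real_counts.items():
        expected = synth_total * real_count / real_total
        observed = synth_counts.get(category, 0)
        statistic += (observed - expected) ** 2 / expected
    return statistic


# Real column: 20% missing. Synthetic column: no missing values at all.
real = ["a"] * 40 + ["b"] * 40 + [None] * 20
synthetic = ["a"] * 50 + ["b"] * 50

stat_dropped = chi_squared_statistic(real, synthetic, include_missing=False)
stat_kept = chi_squared_statistic(real, synthetic, include_missing=True)
print(stat_dropped, stat_kept)
```

With nulls dropped, the two columns look identical (both are half `a`, half `b`), so the statistic is zero even though the synthetic data has lost every missing value. With nulls counted as a category, the mismatch shows up clearly. If CSTest behaves like the first case, it would not be a signal for missing value synthesis.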

npatki, Aug 10 '21 16:08