SDMetrics
Metrics to evaluate quality and efficacy of synthetic datasets.
### Environment details If you are already running SDMetrics, please indicate the following details about the environment in which you are running it: * SDMetrics version: 2.4.2-dev0 * Python version:...
### Problem Description The `ColumnPairMetric` and `SingleColumnMetric` metrics should verify that the input data is of the expected data type for the metric, and raise a user-friendly error if not....
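A minimal sketch of the kind of check this issue asks for, assuming a numerical metric. The `IncompatibleDataError` class and the `validate_numeric_column` helper are hypothetical names used for illustration and are not part of SDMetrics; only `pandas.api.types.is_numeric_dtype` is an existing pandas function.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


class IncompatibleDataError(TypeError):
    """Hypothetical error type for illustration; not an SDMetrics class."""


def validate_numeric_column(column, name="column"):
    """Return ``column`` as a Series, or raise a readable error if it is not numeric."""
    series = pd.Series(column)
    if not is_numeric_dtype(series):
        raise IncompatibleDataError(
            f"'{name}' must contain numerical data, but has dtype '{series.dtype}'."
        )
    return series


# Numerical input passes through unchanged.
validate_numeric_column([1.0, 2.5, 3.0], name="real_data")

# Categorical input fails early with a descriptive message.
try:
    validate_numeric_column(["a", "b"], name="synthetic_data")
except IncompatibleDataError as error:
    message = str(error)
```

Note that `is_numeric_dtype` also accepts boolean columns, so a real check might need a stricter rule depending on the metric.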
### Problem Description Right now the metrics are computed based on real data vs synthetic data for [ML efficacy](https://sdv.dev/SDV/user_guides/evaluation/single_table_metrics.html#machine-learning-efficacy-metrics). While this information is perfect to gauge if a model could...
### Problem Description As a user, I want only the relevant errors surfaced to me and expected behavior to be suppressed. For now, focus on the KS Test metric (see...
Currently `SDMetrics` only provides the two-sample KS test to compare numerical values. We should consider adding other tests as an optional parameter, so the user can choose a test...
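One way the optional parameter could look, sketched with SciPy. The `compare_numerical` function, the `TESTS` mapping and its keys are assumptions made for illustration, not an existing SDMetrics interface; `scipy.stats.ks_2samp` and `scipy.stats.cramervonmises_2samp` are existing SciPy functions (the latter requires SciPy 1.7 or newer).

```python
import numpy as np
from scipy import stats

# Hypothetical registry of selectable two-sample tests.
TESTS = {
    "ks": stats.ks_2samp,               # Kolmogorov-Smirnov
    "cvm": stats.cramervonmises_2samp,  # Cramer-von Mises
}


def compare_numerical(real, synthetic, test="ks"):
    """Run the chosen two-sample test and return (statistic, p-value)."""
    if test not in TESTS:
        raise ValueError(f"Unknown test '{test}'. Choose one of {sorted(TESTS)}.")
    result = TESTS[test](real, synthetic)
    return result.statistic, result.pvalue


rng = np.random.default_rng(0)
real = rng.normal(loc=0.0, size=200)
synthetic = rng.normal(loc=0.5, size=200)

ks_statistic, ks_pvalue = compare_numerical(real, synthetic, test="ks")
cvm_statistic, cvm_pvalue = compare_numerical(real, synthetic, test="cvm")
```

Keeping the tests in a mapping means a new test can be added without touching the calling code, as long as its result exposes `statistic` and `pvalue`.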
### Problem Description Current time series metrics in SDMetrics are detection/classifier based. It would be beneficial to have a metric that assesses the quality of the synthetic time series and...
And vice-versa. Currently, if the wrong datatype is passed, the metric simply returns `nan`. It should raise an error instead. Below is code to reproduce this phenomenon: ```python3 import pandas...
### Problem Description As mentioned in #70, the current implementation of `CSTest` might not be entirely correct. Before applying `CSTest`, we currently normalize the frequencies of each category but in...
NaN values should be supported by numerical privacy metrics, but the metrics currently raise `ValueError: Input contains NaN, infinity or a value too large for dtype('float64').` The code below reproduces this...
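Until NaN values are supported, one possible workaround is to remove non-finite rows before calling a metric. This is a sketch under that assumption, not the fix proposed in the issue; the `drop_non_finite_rows` helper and the column names are hypothetical.

```python
import numpy as np
import pandas as pd


def drop_non_finite_rows(data, columns):
    """Return a copy of ``data`` without rows holding NaN or infinity in ``columns``."""
    cleaned = data.copy()
    # Treat +/- infinity like missing values so one dropna call handles both.
    cleaned[columns] = cleaned[columns].replace([np.inf, -np.inf], np.nan)
    return cleaned.dropna(subset=columns).reset_index(drop=True)


real = pd.DataFrame({
    "age": [31.0, np.nan, 45.0, np.inf],
    "income": [52000.0, 61000.0, np.nan, 70000.0],
})
cleaned = drop_non_finite_rows(real, ["age", "income"])
```

Dropping rows changes the data the metric sees, so imputing the missing values may be preferable when many rows are affected.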
### Environment Details * SDV version: 0.13.0 * Python version: 3.8.9 * Operating System: MacOS ### Error Description The Numerical Privacy Metrics throw an error whenever the target columns (sensitive_fields)...