evidently
inconsistent dataset_drift value in the data drift report
Hi, inside data_drift_report there are two metrics (DatasetDriftMetric and DataDriftTable) that report dataset_drift. Unlike DatasetDriftMetric, DataDriftTable does not let one set the drift_share value. There are situations where dataset_drift is true for DatasetDriftMetric while false for DataDriftTable. This has caused confusion among some data scientists in my project.
I attached a screenshot above. In the given Google Colab example, I set drift_share = 0.4, and the resulting dataset_drift values are inconsistent. I also briefly checked the source code: the DataDriftTable function (https://github.com/evidentlyai/evidently/blob/174bde94d7ba682e125395bfe100b4d4b99f74e2/src/evidently/metrics/data_drift/data_drift_table.py#L43) does not seem to pass the drift_share value anywhere, yet it still outputs a dataset_drift value. Would it be better to remove dataset_drift from DataDriftTable?
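To illustrate how the two flags can diverge, here is a minimal, library-free Python sketch. It assumes that dataset_drift is set when the share of drifted columns reaches drift_share, and that a metric which is not given drift_share falls back to a default of 0.5. Both points are assumptions about the behavior, not a copy of the evidently source, and the function and column names are hypothetical.

```python
from typing import Dict

# Assumed default threshold used when drift_share is not passed.
DEFAULT_DRIFT_SHARE = 0.5


def dataset_drift(column_drift: Dict[str, bool],
                  drift_share: float = DEFAULT_DRIFT_SHARE) -> bool:
    """Flag dataset drift when the share of drifted columns reaches drift_share."""
    if not column_drift:
        return False
    share = sum(column_drift.values()) / len(column_drift)
    return share >= drift_share


# Hypothetical per-column results: 9 of 20 columns drifted (share = 0.45).
columns = {f"col_{i}": i < 9 for i in range(20)}

# Stand-in for DatasetDriftMetric with the user-configured threshold.
metric_flag = dataset_drift(columns, drift_share=0.4)

# Stand-in for DataDriftTable, where the threshold is not passed through.
table_flag = dataset_drift(columns)

print(metric_flag, table_flag)
```

With a drifted share between the two thresholds, the same per-column results produce two different dataset_drift values, which matches the inconsistency described above.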
Thanks for sharing @userkkw - it would indeed make sense to pass the drift_share parameter to DataDriftTable. We will address this in one of the following releases.