
inconsistent dataset_drift value in the data drift report

Open • userkkw opened this issue 1 year ago • 1 comment

Hi, inside data_drift_report, two metrics (DatasetDriftMetric and DataDriftTable) both report dataset_drift. Unlike DatasetDriftMetric, DataDriftTable does not let you set the drift_share value. As a result, there are situations where dataset_drift is true for DatasetDriftMetric but false for DataDriftTable. This has caused confusion among some data scientists on my project.

[Screenshot 2023-07-14 at 09 43 35]

I attached a screenshot above. In the given Google Colab example, I set drift_share = 0.4, and the two dataset_drift values are inconsistent. I also briefly checked the source code: the DataDriftTable metric (https://github.com/evidentlyai/evidently/blob/174bde94d7ba682e125395bfe100b4d4b99f74e2/src/evidently/metrics/data_drift/data_drift_table.py#L43) does not seem to pass a drift_share value anywhere, yet it still outputs dataset_drift. Would it be better to remove dataset_drift from DataDriftTable?
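
To make the mismatch concrete, here is a minimal sketch in plain Python that does not call evidently at all. It assumes dataset_drift is flagged when the share of drifted columns is at least drift_share, and that a metric which is not given drift_share falls back to 0.5. The helper name `dataset_drift`, the `>=` comparison, the 0.5 default, and the 9-of-20 column counts are illustrative assumptions, not taken from the library source.

```python
def dataset_drift(column_drift_flags, drift_share=0.5):
    """Return (share_of_drifted_columns, dataset_drift_flag).

    Hypothetical helper: assumes the dataset is flagged as drifted when the
    share of drifted columns is at least drift_share (default assumed 0.5).
    """
    share = sum(column_drift_flags) / len(column_drift_flags)
    return share, share >= drift_share


# Made-up example: 9 of 20 columns drifted, so the share is 0.45.
flags = [True] * 9 + [False] * 11

# Stand-in for DatasetDriftMetric with drift_share=0.4 set by the user.
share, metric_flag = dataset_drift(flags, drift_share=0.4)

# Stand-in for DataDriftTable, which cannot receive drift_share and so
# uses the assumed default threshold.
_, table_flag = dataset_drift(flags)

print(f"share={share:.2f} metric={metric_flag} table={table_flag}")
```

Under these assumptions, any share of drifted columns between 0.4 and 0.5 makes the two metrics disagree, which matches the true/false pattern in the screenshot.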

userkkw, Jul 14 '23 07:07

Thanks for sharing @userkkw - it would indeed make sense to pass the drift_share parameter to the DataDriftTable. We will address this in one of the following releases.

elenasamuylova, Jul 18 '23 16:07