
Binary Classification Issue

Open abhiyagupta opened this issue 6 months ago • 8 comments

I am running a binary classification report. My "target" column contains binary 0/1 values, "pred_label" contains binary 0/1 values, and "prediction" contains probabilities between 0 and 1. I am using the code snippet below:

data_def = DataDefinition(
    classification=[
        BinaryClassification(
            target="target",
            prediction_labels="pred_label",
            # labels={0: "no complication", 1: "complication"}
        )
    ]
)

The report is generated and includes accuracy, precision, F1, and recall.

But when I add predicted probabilities to include log loss and ROC AUC in the report, using the code below:

data_def_2 = DataDefinition(
    classification=[
        BinaryClassification(
            target="target",
            prediction_labels="pred_label",
            prediction_probas="prediction",
            pos_label=1,
            # labels={0: "no complication", 1: "complication"}
        )
    ]
)

I get ROC AUC and log loss, but accuracy is wrong, and precision, F1, and recall come out as 0 for the same dataset.

ERROR:

/home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages/sklearn/metrics/_classification.py:1565: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use zero_division parameter to control this behavior.

Please help correct this.

abhiyagupta avatar Jun 27 '25 10:06 abhiyagupta

Hi @abhiyagupta,

Could you share a full reproducible example that includes the structure of your dataset with column names (just a dummy example with a few lines) and the exact code you run?

elenasamuylova avatar Jun 27 '25 10:06 elenasamuylova

Dummy DataFrames:

train_df = pd.DataFrame({
    'target': [0, 1, 0, 1, 1],
    'prediction': [0.12, 0.85, 0.33, 0.67, 0.91],
    'pred_label': [0, 1, 0, 1, 1]
})

current_df = pd.DataFrame({
    'target': [0, 1, 0, 1, 1],
    'prediction': [0.12, 0.85, 0.33, 0.67, 0.91],
    'pred_label': [0, 1, 0, 1, 1]
})

Option 2, which gives wrong values for accuracy, precision, F1, and recall:

data_def = DataDefinition(
    classification=[
        BinaryClassification(
            target="target",
            prediction_labels="pred_label",
            prediction_probas="prediction",
            # labels={0: "no complication", 1: "complication"}
        )
    ]
)

train_df_dataset = Dataset.from_pandas(
    pd.DataFrame(train_df),
    data_definition=data_def
)
current_df_dataset = Dataset.from_pandas(
    pd.DataFrame(current_df),
    data_definition=data_def
)

Run the classification report:

report = Report(metrics=[ClassificationPreset()], include_tests=True)
classification_report = report.run(
    reference_data=train_df_dataset,
    current_data=current_df_dataset
)

abhiyagupta avatar Jun 27 '25 10:06 abhiyagupta

I ran your exact code end to end and did not observe any error. Here is the Colab: https://colab.research.google.com/drive/1ZpVg7A02c0yAMjAEodRxPUdtO9f779rl

The accuracy and precision are 100% as expected given the input.

Could you clarify what the issue is with the minimal example you shared? Do you get an error when you run it?

elenasamuylova avatar Jun 27 '25 11:06 elenasamuylova

I had shared complete dummy data previously as an example.

[Image]

Everything is the same; I only added the prediction column, which contains probabilities between 0 and 1 (float).

[Image]

prediction: [Image]

abhiyagupta avatar Jun 27 '25 11:06 abhiyagupta

Snippet of target, prediction, and pred_label columns:

[Image]

abhiyagupta avatar Jun 27 '25 12:06 abhiyagupta

It seems that the differences in metrics come from the fact that you want to treat the predicted probability of 0.158 as class "1", whereas Evidently applies the default threshold of 0.5.

To clarify how data definition works in Evidently:

  • When you are dealing with non-probabilistic binary classification, you need prediction_labels.
  • When you are dealing with probabilistic binary classification, you need prediction_probas.

You can find examples of correct usage here: https://github.com/evidentlyai/evidently/blob/main/examples/cookbook/metrics.ipynb
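
For reference, the two setups could look like this (a minimal sketch using the dummy columns from this thread; the import path is an assumption and may differ between Evidently versions):

# Assumed import path for the current Evidently API; adjust to your version.
from evidently import DataDefinition, BinaryClassification

# Non-probabilistic setup: metrics are computed directly from the 0/1 labels.
data_def_labels = DataDefinition(
    classification=[
        BinaryClassification(
            target="target",
            prediction_labels="pred_label",
        )
    ]
)

# Probabilistic setup: labels are derived from the probabilities using the
# decision threshold (0.5 by default), and ROC AUC / log loss become available.
data_def_probas = DataDefinition(
    classification=[
        BinaryClassification(
            target="target",
            prediction_probas="prediction",
            pos_label=1,
        )
    ]
)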

By default, Evidently applies a 0.5 decision threshold to predicted probabilities. This means that if the predicted probability is 0.18, it will be treated as class 0. If all your predicted probabilities are below 0.5 (as shown in your screenshot), there will be no instances of class "1" in the dataset.

This explains why you might see different metric values depending on what you pass:

  • If you pass labels [1, 1], metrics are computed directly from those labels.
  • If you pass probas = [0.18, 0.21], those are converted to labels [0, 0] (using the default 0.5 threshold), and metrics are computed based on that.
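
This is also where the UndefinedMetricWarning from the original post comes from. Here is a small standalone illustration using scikit-learn directly (the numbers are taken from the example above):

from sklearn.metrics import precision_score

y_true = [1, 1]           # actual labels
y_proba = [0.18, 0.21]    # predicted probabilities for class 1

# The default 0.5 threshold turns both predictions into class 0.
y_pred = [1 if p >= 0.5 else 0 for p in y_proba]  # -> [0, 0]

# With no predicted positives, precision is undefined: scikit-learn emits
# UndefinedMetricWarning and reports 0.0 (pass zero_division=0 to silence it).
print(precision_score(y_true, y_pred))  # 0.0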

If you want to use a different threshold for probabilistic classification, you can specify it using the probas_threshold parameter:

report = Report(metrics=[ClassificationPreset(probas_threshold=0.15)])

This sets the threshold to 0.15 instead of 0.5, and the label assignment will reflect that when computing metrics.
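
Put together with the datasets defined earlier in this thread, the call could look like this (a sketch; it assumes the probabilistic data definition and the train_df_dataset / current_df_dataset objects created above):

# Re-run the classification report with a 0.15 decision threshold.
report = Report(
    metrics=[ClassificationPreset(probas_threshold=0.15)],
    include_tests=True,
)
classification_report = report.run(
    reference_data=train_df_dataset,
    current_data=current_df_dataset,
)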

elenasamuylova avatar Jun 27 '25 14:06 elenasamuylova

Thank you! Appreciate your quick response. This helps!

abhiyagupta avatar Jun 30 '25 09:06 abhiyagupta

Hi, I’d love to work on reproducing this and contributing a fix for binary classification. Can I take this up?

Diksha-3905 avatar Jul 08 '25 17:07 Diksha-3905