openml-python
get_metric_fn can't handle classification metrics using probabilities
import numpy as np
import openml
import sklearn.metrics
import sklearn.pipeline
from sklearn.impute import SimpleImputer
from sklearn.tree import DecisionTreeClassifier

openml.config.apikey = "KEY"

# build a scikit-learn classifier
clf = sklearn.pipeline.make_pipeline(SimpleImputer(), DecisionTreeClassifier())

task_id = 189871
task = openml.tasks.get_task(task_id)  # download the OpenML task
run = openml.runs.run_model_on_task(clf, task)  # run the classifier on the task

# ok:
# score = run.get_metric_fn(sklearn.metrics.accuracy_score)  # print accuracy score

# fails, even if the class labels are passed explicitly:
score = run.get_metric_fn(sklearn.metrics.log_loss, kwargs=dict(labels=task.class_labels))
print('Data set: %s; Log loss: %0.2f' % (task.get_dataset().name, float(np.mean(score))))
fails with:
ValueError: The number of classes in labels is different from that in y_pred. Classes found in labels: ['0' '1' '2' '3' '4']
The problem is an old one: it goes back to 2017, when the function was first written: https://github.com/openml/openml-python/commit/1c285a803b58dca963e4c51930251ac334d94d19
This block:
https://github.com/openml/openml-python/blob/develop/openml/runs/run.py#L489-L492
extracts the index of the predicted class, which is not the correct thing to pass to log_loss.
There are multiple problems:
- A probability-based classification metric like log_loss should be given the predicted probabilities, never the predicted labels, but get_metric_fn makes that impossible (see the sketch after this list).
- What is predicted and passed to any metric should be the original class labels as defined by the task, not their integer index.
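To make the first point concrete, here is a minimal standalone sketch (not openml-python code, and the arrays below are made up for illustration): sklearn.metrics.log_loss expects an (n_samples, n_classes) matrix of predicted probabilities, and it raises exactly the error shown above when it is handed a 1-D array of class indices instead.

```python
# Standalone illustration (made-up data): log_loss needs probabilities,
# not the class indices that get_metric_fn currently hands over.
import numpy as np
from sklearn.metrics import log_loss

class_labels = ["0", "1", "2", "3", "4"]
y_true = np.array(["0", "2", "1", "4"])

# One predicted class index per sample, as get_metric_fn effectively passes:
y_pred_indices = np.array([0, 2, 1, 4])
# log_loss(y_true, y_pred_indices, labels=class_labels)
# -> ValueError: The number of classes in labels is different from that in y_pred. ...

# What log_loss actually needs: an (n_samples, n_classes) probability matrix,
# e.g. the output of clf.predict_proba(X_test).
y_pred_proba = np.array([
    [0.90, 0.025, 0.025, 0.025, 0.025],
    [0.10, 0.10, 0.60, 0.10, 0.10],
    [0.20, 0.50, 0.10, 0.10, 0.10],
    [0.05, 0.05, 0.05, 0.05, 0.80],
])
print(log_loss(y_true, y_pred_proba, labels=class_labels))
```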
I think the only solution is to avoid the helpers in openml-python and directly manage the splits etc.
FYI: https://github.com/openml/openml-python/pull/1140, just to show how it can be made to work. Not necessarily elegant.
In the end I couldn't make this helper work properly; the scores it produces are simply wrong, e.g. an AUC of 1. I had to fall back on the "plain" approach of asking the task for its splits and handling them myself (sketched below), as many other papers have done.
For example, even a random forest was reporting an AUC of 1 for KDDCup09_appetency (task_id 75105).
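For reference, here is a rough sketch of that "plain" approach, assuming the usual task helpers (get_split_dimensions, get_train_test_split_indices, get_X_and_y) and with only minimal preprocessing; because it scores the predicted probabilities directly, probability-based metrics such as AUC come out sensibly.

```python
# Rough sketch of the "plain" approach: ask the task for its splits and
# evaluate the predicted probabilities yourself. Assumes the usual
# openml-python task API; preprocessing is deliberately minimal.
import numpy as np
import openml
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import make_pipeline

task = openml.tasks.get_task(75105)   # KDDCup09_appetency (binary target)
X, y = task.get_X_and_y()             # y comes back label-encoded
n_repeats, n_folds, _ = task.get_split_dimensions()

aucs = []
for repeat in range(n_repeats):
    for fold in range(n_folds):
        train_idx, test_idx = task.get_train_test_split_indices(repeat=repeat, fold=fold)
        clf = make_pipeline(SimpleImputer(), RandomForestClassifier())
        clf.fit(X[train_idx], y[train_idx])
        # Score on the probabilities, not on the predicted class indices.
        proba = clf.predict_proba(X[test_idx])
        aucs.append(roc_auc_score(y[test_idx], proba[:, 1]))

print("Mean AUC over all %d splits: %.3f" % (len(aucs), np.mean(aucs)))
```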
Does the code you provide above trigger this issue together with #1140?
The PR is an attempted workaround, but it's incomplete and doesn't really work in general.
As for the original problem: yes, the code snippet above demonstrates it.
- add a deprecation message for 0.13.x
- fix with the proposal in #1140