openml-python Report minimal evaluation results in run printout

Open joaquinvanschoren opened this issue 3 years ago • 0 comments

Description

Calling print(run) shows only basic info such as the task and flow IDs. It shows no aggregated evaluation results. It would be nice if those could be added to the printout.

Alternatively, add a function that returns a dictionary with the local evaluation results. Currently, for a run that was executed locally (not downloaded from OpenML), run.evaluations is None. run.fold_evaluations does contain accuracy and runtime results, but in a very inconvenient format.

Steps/Code to Reproduce

from sklearn import ensemble
from openml import tasks, runs

clf = ensemble.RandomForestClassifier()
task = tasks.get_task(3954)
run = runs.run_model_on_task(clf, task, avoid_duplicate_runs=False)
print(run)

Expected Results

Some basic evaluation info, e.g. similar to what the R API returns:

OpenML Run NA :: (Task ID = 3954, Flow ID = NA)

$bmr
         task.id           learner.id acc.test.join timetrain.test.sum
1 MagicTelescope classif.randomForest     0.8831756            150.753

Actual Results

Only some basic info:

OpenML Run
==========
Uploader Name: None
Metric.......: None
Run ID.......: None
Task ID......: 3954
Task Type....: None
Task URL.....: https://www.openml.org/t/3954
Flow ID......: None
Flow Name....: sklearn.ensemble._forest.RandomForestClassifier
Flow URL.....: https://www.openml.org/f/None
Setup ID.....: None
Setup String.: Python_3.7.13. Sklearn_1.0.2. NumPy_1.21.6. SciPy_1.4.1.
Dataset ID...: 1120
Dataset URL..: https://www.openml.org/d/1120

Partial solution

The following returns the accuracy and runtime for a run (mean and standard deviation over folds), even if the run was only executed locally:

import numpy as np

def summary(metric):
  # fold_evaluations[metric][0] holds the per-fold values of the first repeat
  values = list(run.fold_evaluations[metric][0].values())
  mean = np.mean(values)
  std = np.std(values)  # standard deviation, not variance
  return "{:.4f} +- {:.4f}".format(mean, std)

print("Accuracy: ", summary('predictive_accuracy'))
print("Time (ms): ", summary('usercpu_time_millis'))
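
The snippet above only looks at the first repeat and formats one metric at a time. A possible generalization, closer to the "function that returns a dictionary" idea from the description, is sketched below. It is not part of openml-python: the name summarize_fold_evaluations is hypothetical, and it assumes the nested metric -> repeat -> fold -> value layout that the snippet above relies on. It uses the standard-library statistics module instead of NumPy; pstdev computes the population standard deviation, matching np.std's default.

```python
from statistics import mean, pstdev


def summarize_fold_evaluations(fold_evaluations):
    """Aggregate a metric -> repeat -> fold -> value mapping.

    Hypothetical helper (not an openml-python API). Returns a dict that
    maps each metric name to its mean, population standard deviation and
    number of values, pooled over all repeats and folds.
    """
    summary = {}
    for metric, repeats in fold_evaluations.items():
        # Flatten all folds of all repeats into a single list of values.
        values = [v for folds in repeats.values() for v in folds.values()]
        if not values:
            continue  # skip metrics that have no recorded values
        summary[metric] = {
            "mean": mean(values),
            "std": pstdev(values),
            "n": len(values),
        }
    return summary
```

With a local run as in the reproduction steps, this would presumably be called as summarize_fold_evaluations(run.fold_evaluations), and the resulting dictionary could then be formatted into the print(run) output.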

Versions

Linux-5.4.188+-x86_64-with-Ubuntu-18.04-bionic
Python 3.7.13 (default, Apr 24 2022, 01:04:09) [GCC 7.5.0]
NumPy 1.21.6
SciPy 1.4.1
Scikit-Learn 1.0.2
OpenML 0.12.2

Jun 24 '22 12:06 joaquinvanschoren
