openml-python

Bug: publishing run using documentation example throws Arff error

Open josvandervelde opened this issue 1 year ago • 2 comments

Description

Running the example code found at https://openml.github.io/openml-python/main/ throws an error when the run is published.

Steps/Code to Reproduce

import openml
from sklearn import impute, tree, pipeline

clf = pipeline.Pipeline(
    steps=[
        ('imputer', impute.SimpleImputer()),
        ('estimator', tree.DecisionTreeClassifier())
    ]
)
task = openml.tasks.get_task(32)
run = openml.runs.run_model_on_task(clf, task)
run.publish()

I'm running this using https://github.com/openml/services/.

Expected Results

No error is thrown, run is uploaded.

Actual Results

openml.exceptions.OpenMLServerException: http://nginx:80/api/v1/xml/run/ returned code 209: Error parsing uploaded file. - Arff error in predictions file: invalid value for nominal attribute: 1 (l.19) 

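If you look at predictions.arff, it is indeed wrong: the prediction and correct attributes are declared as nominal, but the data row contains the integer 1 instead of one of the declared labels: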

[...]
@ATTRIBUTE repeat NUMERIC
@ATTRIBUTE fold NUMERIC
@ATTRIBUTE sample NUMERIC
@ATTRIBUTE row_id NUMERIC
@ATTRIBUTE prediction {tested_negative, tested_positive}
@ATTRIBUTE correct {tested_negative, tested_positive}
@ATTRIBUTE confidence.tested_negative NUMERIC
@ATTRIBUTE confidence.tested_positive NUMERIC

@DATA
0,0,0,53,1,1,0,0
[...]
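
To make the mismatch concrete, here is a minimal sketch in plain Python that checks the data row above against the nominal attribute declarations. It is not part of openml-python and does not reproduce the server's validation; the helper names `parse_attributes` and `find_invalid_nominals` are hypothetical, and the parser handles only the simple `@ATTRIBUTE name TYPE` form shown in the excerpt.

```python
import re

# Header and data row copied from the predictions.arff excerpt above.
HEADER = """\
@ATTRIBUTE repeat NUMERIC
@ATTRIBUTE fold NUMERIC
@ATTRIBUTE sample NUMERIC
@ATTRIBUTE row_id NUMERIC
@ATTRIBUTE prediction {tested_negative, tested_positive}
@ATTRIBUTE correct {tested_negative, tested_positive}
@ATTRIBUTE confidence.tested_negative NUMERIC
@ATTRIBUTE confidence.tested_positive NUMERIC
"""
DATA_ROW = "0,0,0,53,1,1,0,0"


def parse_attributes(header):
    """Return (name, allowed_values) pairs; allowed_values is None for non-nominal attributes."""
    attrs = []
    for line in header.splitlines():
        match = re.match(r"@ATTRIBUTE\s+(\S+)\s+(.+)", line.strip(), re.IGNORECASE)
        if not match:
            continue
        name, type_spec = match.group(1), match.group(2).strip()
        if type_spec.startswith("{") and type_spec.endswith("}"):
            # Nominal attribute: collect the declared labels.
            allowed = [v.strip() for v in type_spec[1:-1].split(",")]
        else:
            allowed = None
        attrs.append((name, allowed))
    return attrs


def find_invalid_nominals(header, row):
    """Return (attribute, value) pairs where a nominal value is not a declared label."""
    attrs = parse_attributes(header)
    values = [v.strip() for v in row.split(",")]
    return [
        (name, value)
        for (name, allowed), value in zip(attrs, values)
        if allowed is not None and value not in allowed
    ]


problems = find_invalid_nominals(HEADER, DATA_ROW)
print(problems)  # [('prediction', '1'), ('correct', '1')]
```

Both nominal columns hold `1`, which is not among the declared labels `tested_negative` and `tested_positive`. This matches the "invalid value for nominal attribute: 1" message returned by the server.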

Versions

Linux-6.8.0-45-generic-x86_64-with-glibc2.36
Python 3.10.15 (main, Sep 27 2024, 06:07:24) [GCC 12.2.0]
NumPy 2.1.1
SciPy 1.14.1
Scikit-Learn 1.5.2
OpenML 0.15.0

josvandervelde, Oct 15 '24 13:10