openml-python
Cannot handle supervised tasks with estimation type as "Test on Training Data"
Description
Certain tasks on the OpenML server have an invalid or illegal estimation procedure type, which results in empty task estimation parameters.
Steps/Code to Reproduce
import openml
from sklearn.svm import SVC

# Use the example (test server) configuration.
openml.config.start_using_configuration_for_example()

clf = SVC()
task = openml.tasks.get_task(1202)  # task whose estimation procedure type is "testontrainingdata"
run = openml.runs.run_model_on_task(model=clf, task=task)
run.publish()  # fails with an OpenMLServerException (code 216, see below)
Expected results
publish() should work successfully.
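For reference, the estimation procedure attached to the task can be inspected directly on the task object. A minimal sketch, assuming the example (test server) configuration and that the "parameters" entry is what ends up empty for the affected tasks:

import openml

openml.config.start_using_configuration_for_example()

# Inspect the estimation procedure of the affected task; for these tasks the
# "parameters" entry appears to be empty.
task = openml.tasks.get_task(1202)
print(task.task_type)
print(task.estimation_procedure["type"])        # expected: "testontrainingdata"
print(task.estimation_procedure["parameters"])  # empty for the affected tasks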
Misc. info
The list of task IDs on the production server with the estimation_procedure type testontrainingdata:
# len(prod_tids) = 9
prod_tids = [168867, 189921, 211700, 211982, 233159, 233160, 317601, 317611, 360876]
The list of task IDs on the test server with the estimation_procedure type testontrainingdata:
# len(test_tids) = 102
test_tids = [1069, 1070, 1071, 1072, 1073, 1074, 1075, 1076, 1077, 1078, 1079, 1080, 1081, 1082, 1083, 1084, 1085, 1086, 1087, 1088, 1089, 1090, 1091, 1092, 1093, 1094, 1095, 1096, 1097, 1098, 1099, 1100, 1101, 1102, 1103, 1104, 1105, 1106, 1107, 1108, 1109, 1110, 1111, 1112, 1113, 1114, 1115, 1116, 1117, 1118, 1119, 1120, 1121, 1122, 1123, 1124, 1125, 1126, 1127, 1128, 1129, 1130, 1131, 1132, 1133, 1134, 1135, 1136, 1137, 1138, 1139, 1140, 1141, 1142, 1143, 1144, 1145, 1146, 1147, 1148, 1149, 1150, 1151, 1152, 1153, 1154, 1155, 1156, 1157, 1158, 1159, 1160, 1161, 1162, 1163, 1164, 1165, 1166, 1167, 1168, 1169, 1202]
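A sketch of how such lists could be regenerated from the task listing, against whichever server the current configuration points to; the "estimation_procedure" column name and the exact procedure label are assumptions:

import openml

# List all tasks and keep those whose estimation procedure refers to testing
# on the training data. Assumes the listing exposes an "estimation_procedure"
# column and that the procedure label contains the word "training".
tasks = openml.tasks.list_tasks(output_format="dataframe")
mask = tasks["estimation_procedure"].str.contains("training", case=False, na=False)
tids = sorted(tasks.loc[mask, "tid"].tolist())
print(len(tids), tids)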
On both servers, no runs exist for any of these tasks.
What is the produced error? The fact that no such runs exist on the server points towards a server error rather than an openml-python error. Can you confirm which end is responsible for the failed upload?
What is the produced error?
OpenMLServerException: https://test.openml.org/api/v1/xml/run/ returned code 216: Error processing output data: illegal combination of evaluation measure attributes (repeat, fold, sample) - Measure(s): usercpu_time_millis_training(0, 0), wall_clock_time_millis_training(0, 0), usercpu_time_millis_testing(0, 0), usercpu_time_millis(0, 0), wall_clock_time_millis_testing(0, 0), wall_clock_time_millis(0, 0), predictive_accuracy(0, 0)
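For what it is worth, the measures named in the error are computed locally and attached to the run before upload, so they can be inspected without publishing. A sketch, assuming run_model_on_task populates fold_evaluations in the installed openml-python version:

import openml
from sklearn.svm import SVC

openml.config.start_using_configuration_for_example()
task = openml.tasks.get_task(1202)
run = openml.runs.run_model_on_task(model=SVC(), task=task)

# Print each locally computed measure with its (repeat, fold) indices --
# these are the combinations the server rejects with code 216.
for measure, repeats in (run.fold_evaluations or {}).items():
    for repeat, folds in repeats.items():
        for fold, value in folds.items():
            print(measure, repeat, fold, value)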
I too think it is a server-side error and posted this issue to confirm that.
However, I am not sure whether the predictions we generate locally are what the server expects, since this task type is not common: the test split contains the full dataset.
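A quick way to check the last point locally (a sketch; assumes get_train_test_split_indices works for this estimation procedure and that get_data returns a dataframe in the installed version):

import openml

openml.config.start_using_configuration_for_example()
task = openml.tasks.get_task(1202)

# For "test on training data" there is a single repeat/fold whose test split
# should cover every row of the dataset.
train_idx, test_idx = task.get_train_test_split_indices(repeat=0, fold=0)
X, _, _, _ = task.get_dataset().get_data()
print(len(train_idx), len(test_idx), len(X))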
@mfeurer
Hey @janvanrijn, any insight on this would be nice!