openml-python
Cannot handle supervised tasks with estimation type as "Test on Training Data"
Description
Certain tasks on the OpenML server have an invalid or illegal estimation procedure type, which results in empty task estimation parameters.
Steps/Code to Reproduce
import openml
from sklearn.svm import SVC

# Use the example (test server) configuration.
openml.config.start_using_configuration_for_example()

clf = SVC()
task = openml.tasks.get_task(1202)  # task whose estimation procedure type is "testontrainingdata"
run = openml.runs.run_model_on_task(model=clf, task=task)
run.publish()  # fails with an OpenMLServerException (code 216, see below)
Expected results
publish() should work successfully.
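For reference, the estimation procedure attached to the task can be inspected directly on the task object. A minimal sketch, assuming the example (test server) configuration and that the "parameters" entry is what ends up empty for the affected tasks:

import openml

openml.config.start_using_configuration_for_example()

# Inspect the estimation procedure of the affected task; for these tasks the
# "parameters" entry appears to be empty.
task = openml.tasks.get_task(1202)
print(task.task_type)
print(task.estimation_procedure["type"])        # expected: "testontrainingdata"
print(task.estimation_procedure["parameters"])  # empty for the affected tasks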
Misc. info
The list of task IDs on the production server with the estimation_procedure type testontrainingdata:
# len(prod_tids) = 9
prod_tids = [168867, 189921, 211700, 211982, 233159, 233160, 317601, 317611, 360876]
The list of task IDs on the test server with the estimation_procedure type testontrainingdata:
# len(test_tids) = 102
test_tids = [1069, 1070, 1071, 1072, 1073, 1074, 1075, 1076, 1077, 1078, 1079, 1080, 1081, 1082, 1083, 1084, 1085, 1086, 1087, 1088, 1089, 1090, 1091, 1092, 1093, 1094, 1095, 1096, 1097, 1098, 1099, 1100, 1101, 1102, 1103, 1104, 1105, 1106, 1107, 1108, 1109, 1110, 1111, 1112, 1113, 1114, 1115, 1116, 1117, 1118, 1119, 1120, 1121, 1122, 1123, 1124, 1125, 1126, 1127, 1128, 1129, 1130, 1131, 1132, 1133, 1134, 1135, 1136, 1137, 1138, 1139, 1140, 1141, 1142, 1143, 1144, 1145, 1146, 1147, 1148, 1149, 1150, 1151, 1152, 1153, 1154, 1155, 1156, 1157, 1158, 1159, 1160, 1161, 1162, 1163, 1164, 1165, 1166, 1167, 1168, 1169, 1202]
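A sketch of how such lists could be regenerated from the task listing, against whichever server the current configuration points to; the "estimation_procedure" column name and the exact procedure label are assumptions:

import openml

# List all tasks and keep those whose estimation procedure refers to testing
# on the training data. Assumes the listing exposes an "estimation_procedure"
# column and that the procedure label contains the word "training".
tasks = openml.tasks.list_tasks(output_format="dataframe")
mask = tasks["estimation_procedure"].str.contains("training", case=False, na=False)
tids = sorted(tasks.loc[mask, "tid"].tolist())
print(len(tids), tids)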
On both servers, no runs exist for any of these tasks.
What is the produced error? The fact that no such runs exist on the server points towards a server error rather than an openml-python error. Can you confirm which end is responsible for the failed upload?
What is the produced error?
OpenMLServerException: https://test.openml.org/api/v1/xml/run/ returned code 216: Error processing output data: illegal combination of evaluation measure attributes (repeat, fold, sample) - Measure(s): usercpu_time_millis_training(0, 0), wall_clock_time_millis_training(0, 0), usercpu_time_millis_testing(0, 0), usercpu_time_millis(0, 0), wall_clock_time_millis_testing(0, 0), wall_clock_time_millis(0, 0), predictive_accuracy(0, 0)
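For what it is worth, the measures named in the error are computed locally and attached to the run before upload, so they can be inspected without publishing. A sketch, assuming run_model_on_task populates fold_evaluations in the installed openml-python version:

import openml
from sklearn.svm import SVC

openml.config.start_using_configuration_for_example()
task = openml.tasks.get_task(1202)
run = openml.runs.run_model_on_task(model=SVC(), task=task)

# Print each locally computed measure with its (repeat, fold) indices --
# these are the combinations the server rejects with code 216.
for measure, repeats in (run.fold_evaluations or {}).items():
    for repeat, folds in repeats.items():
        for fold, value in folds.items():
            print(measure, repeat, fold, value)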
I too think it is a server-side error and posted this issue to confirm that.
However, I am not sure whether the predictions we generate locally are what the server expects, since this task type is not common: the test split contains the full dataset.
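A quick way to check the last point locally (a sketch; assumes get_train_test_split_indices works for this estimation procedure and that get_data returns a dataframe in the installed version):

import openml

openml.config.start_using_configuration_for_example()
task = openml.tasks.get_task(1202)

# For "test on training data" there is a single repeat/fold whose test split
# should cover every row of the dataset.
train_idx, test_idx = task.get_train_test_split_indices(repeat=0, fold=0)
X, _, _, _ = task.get_dataset().get_data()
print(len(train_idx), len(test_idx), len(X))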
@mfeurer
Hey @janvanrijn, any insight on this would be nice!