Invalid estimation procedure ids

Open ArturDev42 opened this issue 2 years ago • 2 comments

Description

On https://www.openml.org/search?type=measure&measure_type=estimation_procedure I can see the different estimation procedure ids. When creating a new task on the test server as follows

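import openml
from openml.tasks import TaskType

# Task definition from the report; id 11 is the value the test server rejects.
test_task = openml.tasks.create_task(
    task_type=TaskType.SUPERVISED_CLASSIFICATION,
    dataset_id=128,
    target_name="class",
    evaluation_measure="predictive_accuracy",
    estimation_procedure_id=11,
)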

with, for example, estimation_procedure_id=11, which according to the list corresponds to '10% Holdout set', the following error is thrown:

OpenMLServerException: https://test.openml.org/api/v1/xml/task/ returned code 622: 
Input value does not match allowed values in foreign column. - 
problematic input: [estimation_procedure], acceptable inputs: [1, 2, 3, 4, 5, 6, 16, 23, 25, 26, 28]
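Until the ids can be queried directly, the accepted values can at least be read back out of the error text. The following is a minimal sketch, assuming the message keeps the "acceptable inputs: [...]" layout shown above. `parse_acceptable_inputs` is a hypothetical helper, not part of openml-python.

```python
import re
from typing import List

# Error text copied from the report above.
ERROR_MESSAGE = (
    "Input value does not match allowed values in foreign column. - "
    "problematic input: [estimation_procedure], "
    "acceptable inputs: [1, 2, 3, 4, 5, 6, 16, 23, 25, 26, 28]"
)


def parse_acceptable_inputs(message: str) -> List[int]:
    """Return the ids listed after 'acceptable inputs:' (hypothetical helper).

    Assumes the bracketed, comma-separated layout seen in the error above;
    returns an empty list if the message does not contain such a list.
    """
    match = re.search(r"acceptable inputs:\s*\[([^\]]*)\]", message)
    if match is None:
        return []
    return [int(token) for token in match.group(1).split(",") if token.strip()]


allowed = parse_acceptable_inputs(ERROR_MESSAGE)
print(11 in allowed)  # False: id 11 is not accepted by the test server
print(25 in allowed)  # True: id 25 passes this particular check
```

This only confirms which ids the test server accepts for this field. It says nothing about what each id means there.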

If I instead use, for example, estimation_procedure_id=25, I receive the error described in openml/OpenML#1190. The versions I use are the same as those mentioned in openml/OpenML#1190.

Improvement of docs

I think it would be great to show the list of possible estimation procedures in a tutorial somewhere. I learned that such a list exists from the comment mentioned in https://openml.github.io/openml-python/develop/examples/30_extended/task_manual_iteration_tutorial.html, but I noticed only by chance that I need to click on 'Show list' to see the procedure ids. It would also be great to be able to query that information via the API.

Is it also possible to manually create new estimation procedures?

Thanks!

ArturDev42, Apr 21 '23 17:04

Hi @ArturDev42, thank you very much for raising this issue. Yes, such lists are available, but they are well hidden and only accessible through the API:

As you can see, these estimation procedures vary quite a bit from the live to the test server.

To ultimately improve the situation, do you think it would make sense to instead create an API call that returns the estimation procedures? Then one could look up the current definitions dynamically.
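A rough sketch of what such a dynamic lookup could look like on the client side follows. The endpoint suffix `estimationprocedure/list`, the `oml` namespace URI, and the element names are assumptions about the REST API's XML response and should be checked against the server. `fetch_estimation_procedures` and `parse_estimation_procedures` are hypothetical helpers, and the sample payload is made up for illustration (only the name '10% Holdout set' for id 11 comes from this thread).

```python
import urllib.request
import xml.etree.ElementTree as ET
from typing import Dict, List

# Assumed XML namespace of the OpenML REST responses.
OML_NS = {"oml": "http://openml.org/openml"}

# Illustrative payload, not a real server response.
SAMPLE_XML = """<oml:estimationprocedures xmlns:oml="http://openml.org/openml">
  <oml:estimationprocedure>
    <oml:id>11</oml:id>
    <oml:ttid>1</oml:ttid>
    <oml:name>10% Holdout set</oml:name>
    <oml:type>holdout</oml:type>
  </oml:estimationprocedure>
</oml:estimationprocedures>"""


def parse_estimation_procedures(xml_text: str) -> List[Dict[str, object]]:
    """Turn the (assumed) XML listing into a list of plain dicts."""
    root = ET.fromstring(xml_text)
    procedures = []
    for node in root.findall("oml:estimationprocedure", OML_NS):
        procedures.append(
            {
                "id": int(node.findtext("oml:id", namespaces=OML_NS)),
                "task_type_id": int(node.findtext("oml:ttid", namespaces=OML_NS)),
                "name": node.findtext("oml:name", namespaces=OML_NS),
                "type": node.findtext("oml:type", namespaces=OML_NS),
            }
        )
    return procedures


def fetch_estimation_procedures(server: str) -> List[Dict[str, object]]:
    """Download and parse the listing from a server base URL.

    Example base URL: "https://test.openml.org/api/v1/xml".
    The "/estimationprocedure/list" suffix is an assumption.
    """
    url = server.rstrip("/") + "/estimationprocedure/list"
    with urllib.request.urlopen(url, timeout=30) as response:
        return parse_estimation_procedures(response.read().decode("utf-8"))


# Offline demonstration on the sample payload; no network access needed.
procedures = parse_estimation_procedures(SAMPLE_XML)
by_id = {proc["id"]: proc for proc in procedures}
print(by_id[11]["name"])  # 10% Holdout set
```

Calling `fetch_estimation_procedures` once per server (live and test) and comparing the resulting ids would make differences between the two visible before a task is created.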

And to answer your question: no, unfortunately it is not possible to manually create new estimation procedures.

mfeurer, Jun 12 '23 11:06