openml-python

Use names instead of integer ids

Open ArlindKadra opened this issue 4 years ago • 4 comments

Description

Use names instead of integer ids where possible for better readability (e.g., datasets, at least, can be retrieved by name).

ArlindKadra • Apr 26 '21 07:04

Should we consider stating the id in the tutorial text/comments? That way if the dataset is not correctly found by name (e.g. because a newer version is uploaded which breaks the example), users can refer to the id?

PGijsbers • Apr 28 '21 12:04

Hey @PGijsbers, I had the same thought and I fully agree with it. I will do that too.

ArlindKadra • Apr 28 '21 12:04

    If dataset is retrieved by name, a version may be specified.
    If no version is specified and multiple versions of the dataset exist,
    the earliest version of the dataset that is still active will be returned.
    If no version is specified, multiple versions of the dataset exist and
    ``exception_if_multiple`` is set to ``True``, this function will raise an exception.

On another note, reading the documentation, I think we can put the dataset version in the function call, that should also work. We can even give the error_if_multiple its default value for users to understand that the function has quite a bit to offer.
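A minimal sketch of what such a call could look like, assuming `openml.datasets.get_dataset` accepts a dataset name together with the `version` and `error_if_multiple` keyword arguments, as the quoted docstring suggests. The wrapper `load_named_dataset` and its `get_dataset` hook are hypothetical names introduced only for this illustration, and the "iris" dataset with id 61 is an assumed example, not something taken from the tutorials.

```python
def load_named_dataset(name, version=None, error_if_multiple=False, get_dataset=None):
    """Retrieve an OpenML dataset by name rather than by integer id.

    `load_named_dataset` and the `get_dataset` hook are hypothetical helpers
    for this sketch; the hook allows the call to be exercised offline.
    """
    if get_dataset is None:
        # Imported lazily so the sketch can be read and tested without openml.
        import openml

        get_dataset = openml.datasets.get_dataset
    # Pinning the version keeps the example stable if a newer version is uploaded.
    # error_if_multiple is spelled out (assumed default: False) so readers see it.
    return get_dataset(name, version=version, error_if_multiple=error_if_multiple)


# Usage (needs network access and the openml package):
#   dataset = load_named_dataset("iris", version=1)
#   # Assumed for illustration: "iris" version 1 is dataset id 61, so
#   # openml.datasets.get_dataset(61) is the fallback if the name lookup fails.
```

Stating the id in a comment next to the name-based call gives users something to fall back on, along the lines of what @PGijsbers suggested above.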

ArlindKadra • Apr 28 '21 22:04

On another note, reading the documentation, I think we can put the dataset version in the function call, that should also work.

Yes, that sounds like a good idea.

We can even give the error_if_multiple its default value for users to understand that the function has quite a bit to offer.

You could add that to the datasets example.

mfeurer • Apr 29 '21 07:04