openml-r

Documentation of listing data.frames

Open HeidiSeibold opened this issue 7 years ago • 6 comments

E.g. for listOMLTasks the Value documentation is quite sparse:

Value
[data.frame].

It would be nice to know what the columns of the data.frame actually are, or at least to have a link to where I can find this.

Since this is an issue that I just ran into myself, can anyone tell me what max.nominal.att.distinct.values means?

HeidiSeibold • Jan 27 '17 15:01

@joaquinvanschoren it seems a bit stupid to doc this in R, redundantly.

are the docs for the meta features online so we can link to them in the R docs?

berndbischl • Jan 27 '17 15:01

it is apparently under "measures" (which i dislike)

https://www.openml.org/search?type=measure

also for heidi's example it simply says:

MaxNominalAttDistinctValues DataQuality extracted from Fantail Library

:(

@HeidiSeibold It will be the max number of levels for a categorical input feature.

but in general it does not help us if i have to answer that manually. can we at least link to the fantail docs then?
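The definition given above for max.nominal.att.distinct.values can be illustrated with a minimal base R sketch. It assumes that "nominal" corresponds to factor columns and that the declared levels match the values actually observed. The names maxNominalDistinctValues and toy are hypothetical and not part of the OpenML package, and this is not necessarily how Fantail computes the quality.

```r
# Hypothetical helper (not part of the OpenML package): largest number of
# distinct levels among the nominal (factor) input columns of a data.frame.
maxNominalDistinctValues = function(data, target = NULL) {
  # Drop the target column(s) so that only input features are considered.
  features = data[, setdiff(colnames(data), target), drop = FALSE]
  # Keep only the factor columns (assumption: nominal == factor).
  nominal = Filter(is.factor, features)
  if (length(nominal) == 0L) return(NA_integer_)
  max(vapply(nominal, nlevels, integer(1L)))
}

# Toy data: two nominal features and one numeric feature.
toy = data.frame(
  color = factor(c("red", "green", "blue", "red")),
  size = factor(c("S", "L", "S", "L")),
  weight = c(1.2, 3.4, 2.2, 0.7)
)

maxNominalDistinctValues(toy)  # 3, because color has three levels
```

Note that nlevels counts declared levels, so a factor with unused levels would need droplevels first if only observed values should count.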

berndbischl • Jan 27 '17 15:01

Thanks @berndbischl

I agree, a link would be good, but then the documentation on the website needs to be informative.

HeidiSeibold • Jan 27 '17 15:01

Are these two separate issues?

  • Value documentation
  • Data quality descriptions

The data quality list is indeed under measures. I can add a shortcut link if you want. Where do you want it?

I could also split up the measures index into data qualities, evaluation measures, and estimation procedures, but that would be at least a day of work. If you think it really helps I can try to make time.

Fantail is available here: http://fantail.quansun.com/

There is no documentation on the Fantail meta-features, and thus nothing to link to. Even Quan Sun's thesis only has a list of them without description. They are quite straightforward, but someone has to go over them and add a good description. Shall we open up a Google Doc for that?

Thank you, Joaquin

joaquinvanschoren • Jan 27 '17 16:01

> The data quality list is indeed under measures. I can add a shortcut link if you want. Where do you want it?

i would like to have the performance metrics and the data qualities simply in 2 different sections, also in the navigation menu.

> I could also split up the measures index into data qualities, evaluation measures, and estimation procedures, but that would be at least a day of work. If you think it really helps I can try to make time.

i do think that the current state is a bit confusing and such a split-up would help

> There is no documentation on the Fantail meta-features, and thus nothing to link to. Even Quan Sun's thesis only has a list of them without description. They are quite straightforward, but someone has to go over them and add a good description. Shall we open up a Google Doc for that?

without a description they are borderline useless IMHO ... shall someone work on this during the workshop? and how do you do this without fantail docs? i mean, are you sure that you know all of the definitions PRECISELY?

berndbischl • Jan 27 '17 17:01

Can you create an issue to create those separate indices?

The exact definitions of most of these meta-features are in my PhD thesis :). I assume that Fantail has implemented them exactly. Most of the others are simply landmarkers. I'm happy to take an hour at the workshop to clean this up. If you want something more fundamental, like a wiki, that is also possible.

Thank you, Joaquin

joaquinvanschoren • Jan 28 '17 00:01