pmlb

fetch metafeatures option in fetch_data

Open lacava opened this issue 7 years ago • 5 comments

It would be nice to add an option to fetch feature types from fetch_data.

lacava avatar Oct 12 '17 17:10 lacava

I believe that's captured in #3.

rhiever avatar Oct 12 '17 17:10 rhiever

Not as a criterion though... an actual list of the types of each feature.

lacava avatar Oct 12 '17 17:10 lacava

Oh, I see. How would we accomplish that? Scrape from the README?

rhiever avatar Oct 12 '17 18:10 rhiever

Any progress on this? It shouldn't be much work using the metadata file of each dataset. I can create a draft pull request, something like:

dataset, metadata = fetch_data('adult', return_metadata=True)

However, I'm not sure what information should be included in the metadata... I can think of three possible options:

  • the whole metadata.yaml parsed into a dictionary
  • a dictionary feature -> feature_type (e.g., {"age": "continuous", "education_type": "categorical", ....})
  • a list of the feature types (e.g., ["continuous", "categorical", ....])

rrunix avatar Oct 25 '21 11:10 rrunix

Thanks for this note @rrunix. 🙏🏽 🙌🏽 @JDRomano2 would be the contact at this point, but if I may chime in: yes, a PR would be most welcome. My suggestion would be that the argument return_metadata could take 'all' (metadata.yaml parsed into a dictionary), 'features' (dictionary of features), or NA (no metadata).

trangdata avatar Oct 27 '21 16:10 trangdata
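
For reference, the options discussed above could be sketched roughly as follows. This is only an illustration, not PMLB's actual API: the helper name `select_metadata`, the extra `'types'` mode (covering the "list of feature types" option), the use of `None` in place of NA, and the assumed layout of the parsed metadata.yaml (a top-level `features` list whose entries have `name` and `type` keys) are all assumptions. The metadata is given as an in-memory dictionary so the sketch runs without downloading anything.

```python
from typing import Any, Optional

# Hypothetical stand-in for a metadata.yaml file that has already been parsed
# into a dictionary. The layout (a "features" list with "name"/"type" keys)
# is an assumption made for this sketch.
EXAMPLE_METADATA = {
    "dataset": "adult",
    "features": [
        {"name": "age", "type": "continuous"},
        {"name": "education_type", "type": "categorical"},
    ],
}


def select_metadata(metadata: dict, mode: Optional[str] = None) -> Any:
    """Return the slice of the metadata selected by `mode`.

    None       -> no metadata (the "NA" case from the thread)
    'all'      -> the whole parsed metadata dictionary
    'features' -> dictionary mapping feature name to feature type
    'types'    -> list of feature types in file order (hypothetical extra mode)
    """
    if mode is None:
        return None
    if mode == "all":
        return metadata
    # Build the name -> type mapping once; both remaining modes derive from it.
    feature_types = {f["name"]: f["type"] for f in metadata.get("features", [])}
    if mode == "features":
        return feature_types
    if mode == "types":
        return list(feature_types.values())
    raise ValueError(f"unknown metadata mode: {mode!r}")


print(select_metadata(EXAMPLE_METADATA, "features"))
print(select_metadata(EXAMPLE_METADATA, "types"))
```

A `fetch_data` wrapper could call a helper like this after loading the dataset and return `(dataset, metadata)` only when the mode is not `None`, which would leave the existing return value unchanged for current callers.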