GAMLET

Add similarity search of datasets in OpenML database

Open MorrisNein opened this issue 1 year ago • 0 comments

We could add a similarity search that works as follows:

  1. Extract the necessary metafeatures from a custom dataset (implement the corresponding MetaFeaturesExtractor in #3 or in another PR)
  2. Load the full OpenML datasets database
  3. Find the N nearest datasets using DatasetsSimilarityAssessor
  4. Load all evaluations for the closest datasets from OpenML (implement the corresponding ModelsLoader in #3), then keep the best M models for each dataset
  5. Provide a final report to the user
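
To make steps 3 to 5 concrete, here is a minimal sketch in plain Python. It is not the GAMLET API: every function name (`distance`, `nearest_datasets`, `best_models`, `similarity_report`), the metafeature names, and the toy data are hypothetical, and in-memory dictionaries stand in for the OpenML database. It also assumes Euclidean distance over shared metafeatures and "higher score is better", neither of which is stated in this issue.

```python
import math
from typing import Dict, List, Tuple

MetaFeatures = Dict[str, float]
Evaluation = Tuple[str, float]  # (model name, score); assumption: higher is better


def distance(a: MetaFeatures, b: MetaFeatures) -> float:
    """Euclidean distance over the metafeatures both datasets have."""
    shared = set(a) & set(b)
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in shared))


def nearest_datasets(query: MetaFeatures,
                     known: Dict[str, MetaFeatures],
                     n: int) -> List[str]:
    """Step 3: names of the n known datasets closest to the query."""
    ranked = sorted(known, key=lambda name: distance(query, known[name]))
    return ranked[:n]


def best_models(evaluations: List[Evaluation], m: int) -> List[Evaluation]:
    """Step 4: keep the m highest-scoring evaluations."""
    return sorted(evaluations, key=lambda e: e[1], reverse=True)[:m]


def similarity_report(query: MetaFeatures,
                      known: Dict[str, MetaFeatures],
                      evaluations: Dict[str, List[Evaluation]],
                      n: int,
                      m: int) -> Dict[str, List[Evaluation]]:
    """Step 5: map each of the n nearest datasets to its m best models."""
    return {
        name: best_models(evaluations.get(name, []), m)
        for name in nearest_datasets(query, known, n)
    }


# Toy stand-ins for the metafeatures and evaluations loaded from OpenML.
known = {
    "ds_a": {"log_rows": 3.0, "n_features": 10.0},
    "ds_b": {"log_rows": 5.0, "n_features": 200.0},
    "ds_c": {"log_rows": 3.2, "n_features": 12.0},
}
evaluations = {
    "ds_a": [("rf", 0.91), ("knn", 0.85), ("logreg", 0.88)],
    "ds_b": [("mlp", 0.95)],
    "ds_c": [("rf", 0.80), ("svm", 0.83)],
}
query = {"log_rows": 3.1, "n_features": 10.5}

report = similarity_report(query, known, evaluations, n=2, m=2)
for dataset, models in report.items():
    print(dataset, models)
```

In this sketch the raw metafeatures are compared without scaling, so a feature with a large range (here `n_features`) dominates the distance. A real implementation would likely normalize the metafeatures first.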

MorrisNein, Jul 14 '23 12:07