
Multi-table support for featuretools component

Open bchen1116 opened this issue 4 years ago • 1 comments

Extension of issue 470. PR 1454 adds the Featuretools component, but it only handles a single dataframe/datatable. To use Featuretools fully, we want to be able to combine multiple datasets with it.

This issue tracks the implementation of that feature. We want either to allow EvalML to take in EntitySets as args, or to take in the appropriate args so that we can combine the datatables and create the EntitySets under the hood.
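To make the second option concrete, here is a minimal plain-Python sketch of what "combining the datatables" could amount to: a parent table, a child table, and a description of how they link, reduced to one feature row per parent. This is not the Featuretools or EvalML API. `Relationship`, `combine_tables`, and the sample tables are hypothetical names invented for illustration, and the `COUNT(...)`/`SUM(...)` column labels are only chosen to resemble aggregation-style feature names.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """Hypothetical description of a one-to-many link between two tables."""
    parent_table: str
    parent_key: str
    child_table: str
    child_key: str


def combine_tables(tables, relationship, value_column):
    """Return one row per parent with COUNT and SUM aggregated over its child rows."""
    parents = tables[relationship.parent_table]
    children = tables[relationship.child_table]

    # Group the child values by the key that points at the parent row.
    grouped = defaultdict(list)
    for row in children:
        grouped[row[relationship.child_key]].append(row[value_column])

    combined = []
    for parent in parents:
        values = grouped.get(parent[relationship.parent_key], [])
        new_row = dict(parent)  # copy so the input table is not mutated
        new_row[f"COUNT({relationship.child_table})"] = len(values)
        new_row[f"SUM({relationship.child_table}.{value_column})"] = sum(values)
        combined.append(new_row)
    return combined


# Illustrative data only: two customers and their orders.
tables = {
    "customers": [
        {"customer_id": 1, "region": "east"},
        {"customer_id": 2, "region": "west"},
    ],
    "orders": [
        {"order_id": 10, "customer_id": 1, "amount": 20.0},
        {"order_id": 11, "customer_id": 1, "amount": 5.0},
        {"order_id": 12, "customer_id": 2, "amount": 7.5},
    ],
}
link = Relationship("customers", "customer_id", "orders", "customer_id")
feature_rows = combine_tables(tables, link, "amount")
```

The sketch suggests the minimum the args would have to carry if EvalML built the EntitySet itself: the tables, each table's key, and the parent/child relationships. Accepting a prebuilt EntitySet would move that bookkeeping to the caller.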

Original quip doc here

bchen1116, Nov 23 '20 16:11

After discussion with @dsherry, we decided on the following:

  • Merge Featuretools (single dataframe/datatable support) into EvalML components (not AutoML), then create functionality for ingesting EntitySets (to support multiple datatables/dataframes).
  • Afterwards, we have two potential options:
    • do component caching for just the Featuretools component
    • create caching for pipelines once Pipelines as DAGs is finished
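As a rough illustration of the first caching option, the sketch below wraps a single component and memoizes its `transform` output, keyed by a hash of the input. It is an assumption-laden sketch rather than EvalML's design: `CachedComponent` and `SlowFeatureBuilder` are hypothetical names, the component interface is reduced to one `transform` method, and the key assumes the input pickles to the same bytes every time.

```python
import hashlib
import pickle


class CachedComponent:
    """Hypothetical wrapper that memoizes a component's transform output."""

    def __init__(self, component):
        self.component = component
        self._cache = {}
        self.hits = 0

    @staticmethod
    def _key(data):
        # Hash the pickled input; assumes equal inputs pickle to equal bytes.
        return hashlib.sha256(pickle.dumps(data)).hexdigest()

    def transform(self, data):
        key = self._key(data)
        if key in self._cache:
            # Same input seen before: skip the expensive call.
            self.hits += 1
            return self._cache[key]
        result = self.component.transform(data)
        self._cache[key] = result
        return result


class SlowFeatureBuilder:
    """Stand-in for an expensive feature-engineering component."""

    def __init__(self):
        self.calls = 0

    def transform(self, data):
        self.calls += 1
        # Append the row sum as a new "feature" column.
        return [row + [sum(row)] for row in data]


builder = SlowFeatureBuilder()
cached = CachedComponent(builder)
first = cached.transform([[1, 2], [3, 4]])
second = cached.transform([[1, 2], [3, 4]])  # served from the cache
```

Caching at the component level like this is self-contained, whereas the pipeline-level option would need to know which upstream outputs each node depends on, which is presumably why it is tied to the Pipelines as DAGs work.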

Next steps: discuss the options with the FT team, redo the perf tests on the initial Featuretools component implementation, and update the quip doc.

bchen1116, Dec 04 '20 16:12