
Better single entity API

Open kmax12 opened this issue 5 years ago • 1 comments

If I have a single entity, it'd be great if I could just initialize an Entity and then pass it to DFS. Many of our users have only a single table, so streamlining the API for this case would help a lot.

Potential API


entity = ft.Entity(
    entity_id,  # optional
    dataframe,
    variable_types=variable_types,
    index=index,
    time_index=time_index,
    secondary_time_index=secondary_time_index,
    make_index=make_index,
)

ft.dfs(entity, cutoff_time, trans_primitives)

Quick thoughts on how to implement

  • Update the Entity API so that it does not require an entityset as a param
  • Move methods on Entity that require the entityset to EntitySet
  • Update dfs and calculate_feature_matrix to convert a single entity into an entityset and then run as normal.
    • Maybe disable some arguments to dfs that don't make sense in the single-table case
    • Before implementing, we can discuss whether we should define a new method instead of using DFS
  • It'd be cool if I could call normalize_entity on this entity object to create a second entity and convert it into an entityset
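
The first three bullets can be sketched in plain Python. This is only an illustration under stated assumptions, not featuretools code: the `Entity` and `EntitySet` classes below are simplified stand-ins, and `as_entityset` and `resolve_dfs_inputs` are hypothetical helper names. It shows an entity that exists without an entityset, and an entry point that wraps a lone entity in a one-entity entityset before continuing as normal.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Union


@dataclass
class Entity:
    """Stand-in for a standalone entity: it holds its own data, no EntitySet needed."""
    dataframe: Any                     # e.g. a pandas DataFrame
    index: str
    entity_id: str = "entity"          # optional, as in the proposal above
    time_index: Optional[str] = None


@dataclass
class EntitySet:
    """Stand-in container; methods that need relationships would live here."""
    id: str
    entities: Dict[str, Entity] = field(default_factory=dict)

    def add_entity(self, entity: Entity) -> "EntitySet":
        self.entities[entity.entity_id] = entity
        return self


def as_entityset(data: Union[Entity, EntitySet]) -> EntitySet:
    """Wrap a lone Entity in a one-entity EntitySet; pass an EntitySet through."""
    if isinstance(data, EntitySet):
        return data
    return EntitySet(id=data.entity_id + "_es").add_entity(data)


def resolve_dfs_inputs(data, target_entity=None):
    """Hypothetical first step of dfs: normalize inputs, then run as normal."""
    es = as_entityset(data)
    if target_entity is None:
        if len(es.entities) != 1:
            raise ValueError("target_entity is required when there are several entities")
        # With a single entity there is only one possible target.
        target_entity = next(iter(es.entities))
    return es, target_entity


# Usage: a single table goes straight in, with no explicit EntitySet.
rows = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 7.5}]
entity = Entity(dataframe=rows, index="id", entity_id="transactions")
es, target = resolve_dfs_inputs(entity)
```

One design consequence of this shape is that the multi-table path is untouched: passing an existing entityset returns it unchanged, so only the single-entity case gains new behavior.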

We should also add a documentation guide outlining how to use DFS with a single table. Right now we have answers on Stack Overflow and in the FAQ, but this question comes up frequently.

kmax12, May 29 '20 12:05

IMHO, it makes more sense to work on a single DataFrame (entity).

  1. In real situations, the joining condition might be very complicated and come with additional filtering conditions.
  2. The DFS API makes users think about several things at the same time: first, how to join tables; second, how to generate features. Better APIs for a single DataFrame (entity) would let users focus on each step separately.

dclong, Dec 30 '20 01:12