featuretools
Better single entity API
If I have a single entity, it'd be great if I could just initialize an Entity and then pass the entity to DFS. We have a lot of users who only have a single table, so an improvement that streamlines the API in this case would help a lot.
Potential API
entity = ft.Entity(
    entity_id,  # optional
    dataframe,
    variable_types=variable_types,
    index=index,
    time_index=time_index,
    secondary_time_index=secondary_time_index,
    make_index=make_index
)
ft.dfs(entity, cutoff_time, trans_primitives)
Quick thoughts on how to implement
- Update the `Entity` API to not require an entityset as a param
  - move methods on `Entity` that require the entityset to `EntitySet`
- Update `dfs` and `calculate_feature_matrix` to convert a single entity into an entityset and then run as normal.
  - Maybe disable some arguments to dfs that don't make sense in the single table case
  - before implementing, we can discuss if we should just define a new method instead of using DFS
- It'd be cool if I could call `normalize_entity` on this entity object to then create a second entity and convert it into an entityset
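To make the second bullet concrete, here is a rough sketch of the "wrap a single entity in an entityset, then run as normal" idea. It does not use featuretools at all: `Entity`, `EntitySet`, `as_entityset`, and this `dfs` are hypothetical stand-ins written just for illustration, and a list of row dicts stands in for a DataFrame. Rejecting `agg_primitives` in the single-table case is also only an assumption about which arguments might be disabled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass
class Entity:
    """Hypothetical stand-alone entity: holds its own data, no entityset required."""
    dataframe: List[dict]            # stand-in for a pandas DataFrame
    index: str
    entity_id: str = "entity"        # optional, as in the proposal
    time_index: Optional[str] = None


@dataclass
class EntitySet:
    """Hypothetical container; anything needing cross-entity context lives here."""
    id: str
    entities: Dict[str, Entity] = field(default_factory=dict)

    def add_entity(self, entity: Entity) -> "EntitySet":
        self.entities[entity.entity_id] = entity
        return self


def as_entityset(data: Union[Entity, EntitySet]) -> EntitySet:
    """Wrap a lone Entity in a one-entity EntitySet; pass an EntitySet through."""
    if isinstance(data, EntitySet):
        return data
    return EntitySet(id=f"{data.entity_id}_es").add_entity(data)


def dfs(data, target_entity=None, agg_primitives=None, trans_primitives=None):
    """Entry-point sketch: normalize the input, then check single-table arguments."""
    single = isinstance(data, Entity)
    es = as_entityset(data)
    if single:
        # Assumption: aggregations need a related entity, so reject them here.
        if agg_primitives:
            raise ValueError("agg_primitives do not apply to a single entity")
        target_entity = data.entity_id
    # A real implementation would pass `es` and `target_entity` to the existing DFS code.
    return es, target_entity, list(trans_primitives or [])


transactions = Entity(
    dataframe=[{"id": 1, "amount": 10.0}],
    index="id",
    entity_id="transactions",
)
es, target, trans = dfs(transactions, trans_primitives=["month"])
```

With this shape, existing callers that pass an entityset are unaffected, and the single-entity path only adds a thin conversion and validation step in front of the current code.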
We should also add a documentation guide outlining how to use DFS with a single table. Right now we have answers on Stack Overflow and in the FAQ, but this question comes up frequently.
IMHO, it makes more sense to work on a single DataFrame (entity).
- In real situations, the join condition might be very complicated and come with additional filtering conditions.
- The DFS API makes users think about multiple things at the same time: first, how to join tables; second, how to generate features. By supporting better APIs on a single DataFrame (entity), users can focus on each step separately.