Ability to infer relationships between two df's

Open kmax12 opened this issue 5 years ago • 0 comments

It can be useful to infer relationships between two tables, especially as we build higher-level applications on top of featuretools.

This functionality could be implemented based on rules and heuristics.

Rules:

  • relationships must be between two variables of the same dtype
  • the parent variable must be the index column

Heuristics:

  • relationships often take the form of product_id --> id or product_id --> product_id
  • the child id shouldn't be a numeric semantic type (it's fine if the underlying data is numeric)

def recommend_relationships(entity_a, entity_b):
    """Returns potential relationships between entity a and entity b

    Args:
        entity_a (ft.Entity): entity a
        entity_b (ft.Entity): entity b

    Returns:
        List[ft.Relationship]: list of potential relationships
    """
    pass
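
To make the rules and heuristics concrete, here is a minimal sketch on plain pandas DataFrames rather than `ft.Entity` objects. The function name `recommend_relationships_df`, its parameters, and the `(parent_column, child_column)` tuple return type are all hypothetical and not part of featuretools. Because a plain DataFrame carries no semantic types, the sketch assumes the caller passes the semantically numeric child columns explicitly via `numeric_columns`.

```python
import pandas as pd
from pandas.api.types import is_dtype_equal


def recommend_relationships_df(parent_df, parent_index, child_df, numeric_columns=()):
    """Return candidate (parent_column, child_column) pairs.

    Hypothetical helper: `parent_index` names the parent's index column and
    `numeric_columns` lists child columns whose semantic type is numeric.
    """
    parent_dtype = parent_df[parent_index].dtype
    candidates = []
    for column in child_df.columns:
        # Rule: both variables must have the same dtype.
        if not is_dtype_equal(child_df[column].dtype, parent_dtype):
            continue
        # Heuristic: the child id should not be a numeric semantic type.
        if column in numeric_columns:
            continue
        # Heuristic: names look like product_id --> id or product_id --> product_id.
        name = str(column)
        if name == parent_index or name.endswith("_" + parent_index):
            # Rule: the parent side is always the parent's index column.
            candidates.append((parent_index, column))
    return candidates


products = pd.DataFrame({"id": [1, 2, 3], "price": [9.5, 3.0, 4.25]})
orders = pd.DataFrame({"product_id": [1, 3], "quantity": [2, 1]})
print(recommend_relationships_df(products, "id", orders, numeric_columns={"quantity"}))
```

The name check is deliberately loose: with a parent index called `id`, any child column ending in `_id` matches, so a real implementation would likely need to rank or filter candidates further.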

It may also make sense to provide an API that takes in a full entityset:


def recommend_relationships_entityset(entityset):
    """Returns all potential relationships between two entities in an entityset

    Args:
        entityset (ft.EntitySet): entityset

    Returns:
        List[ft.Relationship]: list of potential relationships
    """
    pass
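
One way the entityset-level version could work is to apply the same dtype rule and naming heuristic to every ordered (parent, child) pair of tables. The sketch below is an assumption-laden illustration, not featuretools code: it stands in for `ft.EntitySet` with a plain dict mapping a table name to a `(DataFrame, index column)` tuple, and the function name `recommend_relationships_tables` and its tuple return type are hypothetical.

```python
from itertools import permutations

import pandas as pd
from pandas.api.types import is_dtype_equal


def recommend_relationships_tables(tables):
    """Return candidate (parent_table, parent_column, child_table, child_column) tuples.

    `tables` is a hypothetical stand-in for an entityset:
    a dict of table name -> (DataFrame, name of its index column).
    """
    candidates = []
    # Relationships are directional, so every ordered pair of tables is checked.
    for (parent_name, parent), (child_name, child) in permutations(tables.items(), 2):
        parent_df, parent_index = parent
        child_df, _ = child
        parent_dtype = parent_df[parent_index].dtype
        for column in child_df.columns:
            # Rule: both variables must have the same dtype.
            if not is_dtype_equal(child_df[column].dtype, parent_dtype):
                continue
            # Heuristic: child column is named like the parent index,
            # either exactly or with a prefix (product_id --> id).
            name = str(column)
            if name == parent_index or name.endswith("_" + parent_index):
                candidates.append((parent_name, parent_index, child_name, column))
    return candidates


customers = pd.DataFrame({"customer_id": [1, 2], "name": ["Ann", "Bo"]})
orders = pd.DataFrame({"order_id": [10, 11, 12], "customer_id": [1, 1, 2]})
tables = {"customers": (customers, "customer_id"), "orders": (orders, "order_id")}
print(recommend_relationships_tables(tables))
```

Checking ordered pairs keeps the parent and child roles explicit, at the cost of a number of comparisons that grows quadratically with the number of tables.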

kmax12, Mar 24 '20 15:03