
Support Datetime Feature +/- pd.Timedelta

Open wujunzhuo opened this issue 6 years ago • 7 comments

Bug/Feature Request Description

In [1]: import featuretools as ft

In [2]: es = ft.demo.load_mock_customer(return_entityset=True)

In [3]: import pandas as pd

In [4]: f = ft.Feature(es['customers']['date_of_birth'])

In [5]: ft.calculate_feature_matrix([f + pd.Timedelta(1, 'y')], es)
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-5-340973cfbfbd> in <module>
----> 1 ft.calculate_feature_matrix([f + pd.Timedelta(1, 'y')], es)

/usr/local/lib/python3.7/site-packages/featuretools/feature_base/feature_base.py in __add__(self, other)
    242     def __add__(self, other):
    243         """Add other"""
--> 244         return self._handle_binary_comparision(other, primitives.AddNumeric, primitives.AddNumericScalar)
    245 
    246     def __radd__(self, other):

/usr/local/lib/python3.7/site-packages/featuretools/feature_base/feature_base.py in _handle_binary_comparision(self, other, Primitive, PrimitiveScalar)
    214             return Feature([self, other], primitive=Primitive)
    215 
--> 216         return Feature([self], primitive=PrimitiveScalar(other))
    217 
    218     def __eq__(self, other):

/usr/local/lib/python3.7/site-packages/featuretools/feature_base/feature_base.py in __new__(self, base, entity, groupby, parent_entity, primitive, use_previous, where)
    733                                                primitive=primitive,
    734                                                groupby=groupby)
--> 735             return TransformFeature(base, primitive=primitive)
    736 
    737         raise Exception("Unrecognized feature initialization")

/usr/local/lib/python3.7/site-packages/featuretools/feature_base/feature_base.py in __init__(self, base_features, primitive, name)
    637                                                relationship_path=RelationshipPath([]),
    638                                                primitive=primitive,
--> 639                                                name=name)
    640 
    641     @classmethod

/usr/local/lib/python3.7/site-packages/featuretools/feature_base/feature_base.py in __init__(self, entity, base_features, relationship_path, primitive, name)
     53         self._name = name
     54 
---> 55         assert self._check_input_types(), ("Provided inputs don't match input "
     56                                            "type requirements")
     57 

AssertionError: Provided inputs don't match input type requirements

wujunzhuo avatar Jun 28 '19 06:06 wujunzhuo

hi @wujunzhuo - thanks for the suggestion!

We can look into supporting this. Can you explain the use case for doing this while performing feature engineering?

kmax12 avatar Jun 28 '19 13:06 kmax12

Thanks for your reply.

We're aiming to build a number of features over different time periods, such as average income over the most recent 1/5/10 years, login counts in the first 1/3/7 days after registration, etc.

First we tried the use_previous mechanism. However, our features are not based on a single cutoff_time, so we would need to call calculate_feature_matrix many times. Instead, we express these time durations as feature where conditions and, since add/subtract operators are not supported on Datetime features, we simply convert the Datetime type to Numeric (Unix timestamp).
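
The Unix-timestamp workaround described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not featuretools code; the helper names `to_unix` and `logins_within` and the sample data are assumptions made up for the example.

```python
from datetime import datetime, timedelta, timezone


def to_unix(ts: datetime) -> float:
    """Convert a timezone-aware datetime to seconds since the Unix epoch."""
    return ts.timestamp()


def logins_within(registration: datetime, logins: list, days: int) -> int:
    """Count logins in [registration, registration + days] using plain numbers."""
    start = to_unix(registration)
    # The duration becomes a numeric offset, so no datetime arithmetic is needed.
    end = start + timedelta(days=days).total_seconds()
    return sum(1 for login in logins if start <= to_unix(login) <= end)


# Hypothetical sample data: logins 0.5, 2, 6 and 30 days after registration.
registration = datetime(2019, 1, 1, tzinfo=timezone.utc)
logins = [registration + timedelta(days=d) for d in (0.5, 2, 6, 30)]

# One pass per window length (1, 3 and 7 days after registration).
counts = {days: logins_within(registration, logins, days) for days in (1, 3, 7)}
```

Once the datetimes are numeric, the window boundaries are ordinary additions and comparisons, which is why this route works even without `+`/`-` support on Datetime features.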

😊

wujunzhuo avatar Jul 01 '19 09:07 wujunzhuo

@wujunzhuo does that mean you would also need the date time features to work with the <, <=, >, and >= operators?

If possible, can you share an example snippet of the code you're using to do this, so we can confirm we understand what you're requesting?

kmax12 avatar Jul 01 '19 17:07 kmax12

This could probably be achieved by updating AddNumericScalar and similar primitives to support either Numeric or Datetime inputs.
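
To illustrate the idea of a scalar-add operation that accepts either kind of input, here is a rough standard-library sketch. `AddScalarSketch` is a hypothetical stand-in written for this example; it does not use or reproduce the featuretools primitive API, and the pairing rule (numbers with numbers, datetimes with timedeltas) is an assumption about what such a check might look like.

```python
from datetime import datetime, timedelta
from numbers import Number


class AddScalarSketch:
    """Hypothetical stand-in for a scalar-add primitive; not featuretools code."""

    def __init__(self, value):
        self.value = value

    def accepts(self, sample) -> bool:
        # Datetime inputs pair with timedelta scalars.
        if isinstance(sample, datetime):
            return isinstance(self.value, timedelta)
        # Numeric inputs (excluding booleans) pair with numeric scalars.
        if isinstance(sample, Number) and not isinstance(sample, bool):
            return isinstance(self.value, Number)
        return False

    def __call__(self, column):
        # Reject mismatched inputs up front, similar in spirit to the
        # input-type assertion seen in the traceback above.
        if not all(self.accepts(v) for v in column):
            raise TypeError("Provided inputs don't match input type requirements")
        return [v + self.value for v in column]


shifted_dates = AddScalarSketch(timedelta(days=365))([datetime(1990, 1, 1)])
shifted_numbers = AddScalarSketch(1)([1, 2])
```

The addition itself is the same in both cases; only the input check needs to know that a datetime column is a valid operand when the scalar is a duration.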

rwedge avatar May 21 '20 16:05 rwedge

Hey there, I want to work on this issue. I have created a pull request and linked it to this issue. I tested the code given by the OP, and it now runs without throwing the earlier error. Could you please review my PR and let me know if anything else needs to be done before it can be merged?

rohit901 avatar Aug 30 '21 15:08 rohit901

@ozzieD FYI once you start working on this issue: https://github.com/alteryx/featuretools/pull/1657

gsheni avatar Aug 02 '22 17:08 gsheni