
allow where_primitives to function independently of agg_primitives

Open nitinmnsn opened this issue 3 years ago • 4 comments

I am using the official customer churn prediction example from here.

For quick experimentation, I added a cell between cell 19 and cell 20 that subsets cutoff_times to only two msno values (IDs), like so:

cutoff_times_=cutoff_times.iloc[[33,34,21,22],:].reset_index(drop=True)

cutoff_times_ = cutoff_times_.rename(columns={'cutoff_time':'time'})

Then, in cell 20, I notice that no where-clause features are generated for any primitive in set(where_primitives) - set(agg_primitives). I also get warnings.warn(warning_msg, UnusedPrimitiveWarning) for every primitive that is in the where_primitives list but not in the agg_primitives list.

Attaching a few examples (I changed max_depth to 10 to make sure that insufficient depth is not the cause):

feature_defs,_ = ft.dfs(entityset=es, target_entity='members',
                      agg_primitives = [],
                      trans_primitives = ['month'],
                        cutoff_time_in_index = True,
                      cutoff_time = cutoff_times_,
                      where_primitives = ['max'],
                      max_depth=10, features_only=False)

output: 
/home/nitin/miniconda3/envs/featuretools/lib/python3.9/site-packages/featuretools/synthesis/dfs.py:307: UnusedPrimitiveWarning: Some specified primitives were not used during DFS:
  where_primitives: ['max']
This may be caused by a using a value of max_depth that is too small, not setting interesting values, or it may indicate no compatible variable types for the primitive were found in the data.
  warnings.warn(warning_msg, UnusedPrimitiveWarning)
feature_defs,_ = ft.dfs(entityset=es, target_entity='members',
                      agg_primitives = ['sum'],
                      trans_primitives = ['month'],
                        cutoff_time_in_index = True,
                      cutoff_time = cutoff_times_,
                      where_primitives = ['max','min'],
                      max_depth=10, features_only=False)
output:
/home/nitin/miniconda3/envs/featuretools/lib/python3.9/site-packages/featuretools/synthesis/dfs.py:307: UnusedPrimitiveWarning: Some specified primitives were not used during DFS:
  where_primitives: ['max', 'min']
This may be caused by a using a value of max_depth that is too small, not setting interesting values, or it may indicate no compatible variable types for the primitive were found in the data.
  warnings.warn(warning_msg, UnusedPrimitiveWarning)
feature_defs,_ = ft.dfs(entityset=es, target_entity='members',
                      agg_primitives = ['sum','min'],
                      trans_primitives = ['month'],
                        cutoff_time_in_index = True,
                      cutoff_time = cutoff_times_,
                      where_primitives = ['max','min'],
                      max_depth=10, features_only=False)

output:
/home/nitin/miniconda3/envs/featuretools/lib/python3.9/site-packages/featuretools/synthesis/dfs.py:307: UnusedPrimitiveWarning: Some specified primitives were not used during DFS:
  where_primitives: ['max']
This may be caused by a using a value of max_depth that is too small, not setting interesting values, or it may indicate no compatible variable types for the primitive were found in the data.
  warnings.warn(warning_msg, UnusedPrimitiveWarning)

feature_defs,_ = ft.dfs(entityset=es, target_entity='members',
                      agg_primitives = ['sum','min','max'],
                      trans_primitives = ['month'],
                        cutoff_time_in_index = True,
                      cutoff_time = cutoff_times_,
                      where_primitives = ['max','min','sum','std'],
                      max_depth=10, features_only=False)
output:
/home/nitin/miniconda3/envs/featuretools/lib/python3.9/site-packages/featuretools/synthesis/dfs.py:307: UnusedPrimitiveWarning: Some specified primitives were not used during DFS:
  where_primitives: ['std']
This may be caused by a using a value of max_depth that is too small, not setting interesting values, or it may indicate no compatible variable types for the primitive were found in the data.
  warnings.warn(warning_msg, UnusedPrimitiveWarning)
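
The pattern in the four warnings above can be checked without featuretools. The sketch below only restates the agg_primitives and where_primitives lists from the calls above, together with the primitives each warning reported as unused. The helper name where_only is hypothetical, and this is not how featuretools decides which primitives to use internally; it only shows that the reported lists match set(where_primitives) - set(agg_primitives).

```python
# (agg_primitives, where_primitives, primitives reported unused in the warning),
# copied from the four ft.dfs calls and their warnings above.
examples = [
    ([], ["max"], ["max"]),
    (["sum"], ["max", "min"], ["max", "min"]),
    (["sum", "min"], ["max", "min"], ["max"]),
    (["sum", "min", "max"], ["max", "min", "sum", "std"], ["std"]),
]


def where_only(agg_primitives, where_primitives):
    """Hypothetical helper: where primitives missing from agg_primitives, in input order."""
    agg = set(agg_primitives)
    return [p for p in where_primitives if p not in agg]


for agg, where, reported in examples:
    predicted = where_only(agg, where)
    print(f"agg={agg} where={where} predicted unused={predicted} reported={reported}")
```

In every case the predicted list equals the reported one, which is why I think where_primitives currently depends on agg_primitives.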
Featuretools version: 0.25.0
Featuretools installation directory: /home/nitin/miniconda3/envs/featuretools/lib/python3.9/site-packages/featuretools

SYSTEM INFO

python: 3.9.4.final.0
python-bits: 64
OS: Linux
OS-release: 5.4.0-74-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_IN
LOCALE: en_IN.ISO8859-1

INSTALLED VERSIONS

numpy: 1.20.3
pandas: 1.2.4
tqdm: 4.61.1
PyYAML: 5.4.1
cloudpickle: 1.6.0
dask: 2021.6.0
distributed: 2021.6.0
psutil: 5.8.0
pip: 21.1.2
setuptools: 49.6.0.post20210108

nitinmnsn, Jun 30 '21 06:06