evalml
TimeSeriesImputer should not allow interpolate as strategy for boolean or categorical targets
Currently the `target_impute_strategy` is applied to any kind of target data, regardless of whether the strategy makes sense for that kind of data. This is only problematic for the `interpolate` strategy, as the other two can be used with any data.
Interpolate, however, should only be used with numeric values. Data with the `category` dtype will raise an error from pandas, and data with boolean values, under the nullable type handling, will become `Double` with floating point values imputed, which doesn't make sense (this was actually happening prior to the nullable type handling as well).
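A minimal sketch of the boolean case in plain pandas, assuming the boolean target has already been cast to floats (True as 1.0, False as 0.0), as happens when it becomes `Double`. The data is made up for illustration and this does not go through the imputer itself.

```python
import numpy as np
import pandas as pd

# Hypothetical boolean target with one missing value, already cast to float
# (True -> 1.0, False -> 0.0), mirroring the conversion to Double.
y = pd.Series([1.0, np.nan, 0.0])

# Linear interpolation (the pandas default method) fills the gap
# with the midpoint of its two neighbors.
y_imputed = y.interpolate()

# Collect imputed values that are neither 0.0 nor 1.0; these do not
# correspond to any boolean value.
invalid = y_imputed[~y_imputed.isin([0.0, 1.0])]
print(invalid)
```

The gap between a `True` and a `False` is filled with a value strictly between 0 and 1, which cannot be mapped back to a boolean without an arbitrary rounding rule.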
We should consider either not allowing `interpolate` to be used for non-numeric data (in which case we could remove `y` from the `_integer_nullable_incompatibilities`) or using one of the other interpolate methods listed at https://pandas.pydata.org/docs/reference/api/pandas.Series.interpolate.html.
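The first option could look roughly like the sketch below. The helper name `validate_target_impute_strategy` and the error message are hypothetical and not part of evalml; only the pandas dtype checks are real API. It assumes the check would run on the target before imputation.

```python
import pandas as pd
from pandas.api.types import is_bool_dtype, is_numeric_dtype


def validate_target_impute_strategy(y, strategy):
    """Hypothetical guard: reject "interpolate" for non-numeric targets."""
    if strategy != "interpolate":
        # Other strategies are assumed to work with any kind of data.
        return
    # is_numeric_dtype reports True for boolean dtypes, so booleans
    # have to be excluded explicitly.
    if is_bool_dtype(y.dtype) or not is_numeric_dtype(y.dtype):
        raise ValueError(
            f"'interpolate' requires a numeric target, got dtype {y.dtype}"
        )


# Numeric targets pass silently.
validate_target_impute_strategy(pd.Series([1.0, None, 3.0]), "interpolate")

# Boolean and categorical targets are rejected.
for bad in (
    pd.Series([True, None, False], dtype="boolean"),
    pd.Series(["a", None, "b"], dtype="category"),
):
    try:
        validate_target_impute_strategy(bad, "interpolate")
    except ValueError as err:
        print(err)
```

Raising early keeps the failure consistent across dtypes, instead of surfacing a pandas error for categorical data and a silent conversion for boolean data.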
This will not be seen in AutoML search, because we use the default impute strategies in `_make_component_list_from_actions`, so `interpolate` will not be the `target_strategy`.