Becca McBrayer issues

Results: 16 issues of Becca McBrayer

Conditionally include the imputer in pipelines

Closes #3603

Update split_data with new default for timeseries

Reduces the default `test_size` in `split_data` from 0.2 to 0.1, as long as the new value is greater than the passed-in forecast horizon.
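A minimal sketch of how such a default might be chosen, assuming the comparison is between the number of rows held out at 0.1 and the forecast horizon. The function name `default_timeseries_test_size`, its parameters, and the fallback to 0.2 are hypothetical and are not taken from evalml's `split_data`.

```python
def default_timeseries_test_size(n_rows, forecast_horizon,
                                 new_default=0.1, old_default=0.2):
    """Pick a default test_size for a time series split.

    Hypothetical helper: uses the smaller default only when the rows it
    holds out outnumber the forecast horizon (an assumed interpretation).
    """
    # Number of rows the smaller default would place in the test split.
    n_test_rows = int(n_rows * new_default)
    if n_test_rows > forecast_horizon:
        return new_default
    # Otherwise keep the previous, larger default.
    return old_default
```

With 100 rows and a forecast horizon of 5, the sketch returns 0.1; with 20 rows and the same horizon, it falls back to 0.2.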

Conditionally include the imputer in pipelines

Right now, we add the imputer to every pipeline that includes datatypes that the imputer supports. However, the imputer is only useful when there is missing data for the component...
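The idea can be sketched as a check for missing values before the imputer step is added. This is an illustration only: `has_missing_values` and `build_component_list` are hypothetical helpers, the component names are placeholder strings, and evalml's actual pipeline construction is not shown.

```python
import math


def has_missing_values(rows):
    """Return True if any cell in a list of row dicts is None or NaN."""
    for row in rows:
        for value in row.values():
            if value is None:
                return True
            if isinstance(value, float) and math.isnan(value):
                return True
    return False


def build_component_list(rows, estimator="Estimator"):
    """Hypothetical: add the imputer only when the data has missing values."""
    components = []
    if has_missing_values(rows):
        components.append("Imputer")
    components.append(estimator)
    return components
```

For fully populated data the sketch returns only the estimator; a single `None` or NaN cell is enough to add the imputer step.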

Investigate core objectives for time series classification

With the merge of #3616, we allow CV fold test splits as small as a single data point for time series problems (if the forecast horizon is set to 1)....

bug

Add clustering pipelines

Should include a suitable baseline clustering pipeline. One option for a good baseline would be randomly splitting the data into two clusters, or n clusters if the user supplies the...
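The random-split baseline described above could look roughly like the following. It is a sketch under stated assumptions: `random_baseline_clusters` is a hypothetical name, labels are drawn uniformly at random, and the default of two clusters mirrors the option mentioned in the issue.

```python
import random


def random_baseline_clusters(n_samples, n_clusters=2, seed=0):
    """Assign each sample to a cluster label chosen uniformly at random.

    Hypothetical baseline: labels are integers in [0, n_clusters).
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    # A seeded generator keeps the baseline reproducible across runs.
    rng = random.Random(seed)
    return [rng.randrange(n_clusters) for _ in range(n_samples)]
```

Because the assignment ignores the data entirely, any real clustering pipeline would be expected to score better than this baseline on a meaningful clustering metric.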

Enable CatBoost with LIME explain_predictions

Follow up from PR #2905 where we temporarily disabled running `explain_predictions` and `explain_predictions_best_worst` with CatBoost models running the LIME algorithm (details in comment [here](https://github.com/alteryx/evalml/pull/2905#discussion_r733115415)). We should figure out how to...

enhancement

Refactoring `ColumnSelector` and subclasses

With the addition of the `SelectByType` transformer in #2531, it came up that having `SelectByType` inherit from `ColumnSelector` doesn't add much value at this point. However, it makes logical sense...

enhancement

refactor

Add dimensionality reduction to AutoMLSearch

Exploring the performance of one of our perf test datasets in #2628 showed that the dataset has too many dimensions when compared to the number of...

enhancement

new feature

needs design

Add documentation for unsupervised V1

Once the critical infrastructure for clustering is built but before all the changes are added to main, we need to make sure we have comprehensive documentation for users to...

documentation

Update dimensionality reduction for easy visualization

This issue covers any necessary API updates, as well as documentation changes.

documentation

enhancement