Becca McBrayer

Results 16 issues of Becca McBrayer

Reduces the default `test_size` in `split_data` from 0.2 to 0.1, as long as the new value is greater than the passed-in forecast horizon.
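A minimal sketch of the rule described in this issue, not evalml's actual implementation. The helper name `default_test_size` and its parameters are hypothetical. It also makes two assumptions: that the comparison is between the number of rows in the reduced test split and the forecast horizon, and that the default stays at 0.2 when the condition fails.

```python
def default_test_size(n_rows, forecast_horizon, reduced=0.1, fallback=0.2):
    """Hypothetical helper illustrating the default selection rule.

    Assumes the "new value" is compared to the forecast horizon as a row count.
    """
    # Number of rows the reduced test split would contain.
    reduced_rows = int(n_rows * reduced)
    # Use the smaller split only if it still exceeds the forecast horizon.
    if reduced_rows > forecast_horizon:
        return reduced
    # Assumed fallback: keep the previous default.
    return fallback


print(default_test_size(1000, 10))  # 0.1 (100 test rows > horizon of 10)
print(default_test_size(50, 10))    # 0.2 (5 test rows is not > 10)
```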

Right now, we add the imputer to every pipeline that includes datatypes that the imputer supports. However, the imputer is only useful when there is missing data for the component...

With the merge of #3616, we allow cv fold test splits as small as a single data point for time series problems (if the forecast horizon is set to 1)....

bug

Should include a suitable baseline clustering pipeline. One option for a good baseline would be randomly splitting the data into two clusters, or n clusters if the user supplies the...
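One way the random-split baseline mentioned in this issue could look, as a sketch only. The function name `random_baseline_labels`, the `seed` parameter, and the plain-list return type are assumptions for illustration and are not part of evalml.

```python
import random


def random_baseline_labels(n_rows, n_clusters=2, seed=0):
    """Hypothetical baseline: assign each row to a cluster uniformly at random."""
    # A seeded generator keeps the baseline reproducible between runs.
    rng = random.Random(seed)
    # Labels are integers in [0, n_clusters).
    return [rng.randrange(n_clusters) for _ in range(n_rows)]


labels = random_baseline_labels(n_rows=10, n_clusters=3)
print(len(labels))  # 10
```

Because the assignment ignores the features entirely, any real clustering pipeline would be expected to score better than this on a clustering metric.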

Follow-up from PR #2905, where we temporarily disabled running `explain_predictions` and `explain_predictions_best_worst` with CatBoost models running the LIME algorithm (details in comment [here](https://github.com/alteryx/evalml/pull/2905#discussion_r733115415)). We should figure out how to...

enhancement

With the addition of the `SelectByType` transformer in #2531, it came up that having `SelectByType` inherit from `ColumnSelector` doesn't add much value at this point. However, it makes logical sense...

enhancement
refactor

Exploring the performance of one of our perf test datasets in #2628 revealed that the dataset has too many dimensions compared to the number of...

enhancement
new feature
needs design

Once the critical infrastructure for clustering is built but before all the changes are added to main, we need to make sure we have comprehensive documentation for users to...

documentation

This issue covers any necessary API updates, as well as documentation changes.

documentation
enhancement