evalml issues

Add util method to component graph to get components based on component class

Right now, `component_graph.get_component` expects a string: the unique name used to find a component in the graph (e.g., "My Label Encoder", not "Label Encoder"). This makes it...
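A minimal sketch of what a class-based lookup could look like. `ToyComponentGraph`, `get_components_by_class`, and the two stand-in component classes are hypothetical names used only for illustration; this is not the evalml `ComponentGraph` API.

```python
from typing import Dict, List, Type


class LabelEncoder:
    """Stand-in for a component class (not the evalml class)."""


class Imputer:
    """Stand-in for a second component class."""


class ToyComponentGraph:
    """Toy graph that maps unique component names to component instances."""

    def __init__(self, components: Dict[str, object]):
        self._components = dict(components)

    def get_component(self, name: str) -> object:
        # Name-based lookup, as described in the issue.
        return self._components[name]

    def get_components_by_class(self, component_class: Type) -> List[str]:
        # Hypothetical helper: names of every component that is an
        # instance of the given class, in insertion order.
        return [
            name
            for name, component in self._components.items()
            if isinstance(component, component_class)
        ]


graph = ToyComponentGraph(
    {"My Label Encoder": LabelEncoder(), "Imputer": Imputer()}
)
names = graph.get_components_by_class(LabelEncoder)
```

Returning names rather than instances keeps the helper composable with the existing name-based `get_component`; whether that is the right return type is a design choice the issue leaves open.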

angela97lin

new feature

good first issue

Ability to use more meta-learner models for stacked ensembles

3

We currently use logistic regression for classification and linear regression for regression. I bet lasso would perform better!

dsherry

new feature

performance

Select pipelines to pass to ensembling based on correlation of the residual
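One possible reading of this idea, as a rough sketch: compute each candidate pipeline's residuals on shared validation data, then greedily keep a pipeline only if its residuals are not strongly correlated with those of pipelines already kept. The function names, the `max_corr` threshold, and the best-first ordering of `predictions` are assumptions made for illustration, not anything evalml defines.

```python
import math
from typing import Dict, List, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # constant residuals: treated as uncorrelated here
    return cov / math.sqrt(var_a * var_b)


def select_by_residual_correlation(
    y_true: Sequence[float],
    predictions: Dict[str, Sequence[float]],
    max_corr: float = 0.9,
) -> List[str]:
    """Greedily keep pipelines whose residuals differ from those already kept.

    `predictions` is assumed to be ordered best pipeline first.
    """
    residuals = {
        name: [t - p for t, p in zip(y_true, preds)]
        for name, preds in predictions.items()
    }
    selected: List[str] = []
    for name in predictions:
        if all(
            abs(pearson(residuals[name], residuals[kept])) <= max_corr
            for kept in selected
        ):
            selected.append(name)
    return selected


y_true = [1.0, 2.0, 3.0, 4.0]
predictions = {
    "a": [1.5, 1.5, 3.5, 3.5],
    "b": [1.4, 1.6, 3.4, 3.6],  # makes the same mistakes as "a"
    "c": [1.5, 2.5, 2.5, 3.5],  # makes different mistakes
}
selected = select_by_residual_correlation(y_true, predictions)
```

Here "b" is dropped because its residuals track those of "a" almost exactly, while "c" is kept because its errors are uncorrelated with "a".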

Overfitting protection @rpeck

dsherry

enhancement

performance

Warm Start for Ensembles

This will involve a larger discussion on how we want to integrate this and what form the warm-start feature should take. Currently I see three implementations that we can consider...

ParthivNaresh

new feature

needs design

performance

spike

Create DataCheck for Unknown types

2

This issue was brought up [here](https://alteryx.atlassian.net/wiki/spaces/PS/pages/643268717/Handle+Unknown+types+from+Woodwork+in+EvalML?focusedCommentId=653230126#comment-653230126) as we integrate the new Woodwork (WW) update into EvalML. Primarily, we want to raise a datacheck warning/error when the dataset a user passed in...
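A rough sketch of the check itself, assuming the column types are available as a plain mapping from column name to logical type name. The function `check_unknown_types` and the shape of the returned warning dicts are hypothetical; they do not follow the evalml `DataCheck` interface.

```python
from typing import Dict, List


def check_unknown_types(logical_types: Dict[str, str]) -> List[dict]:
    """Return one warning per column whose logical type is "Unknown"."""
    return [
        {
            "level": "warning",
            "column": column,
            "message": f"Column '{column}' has an Unknown logical type.",
        }
        for column, logical_type in logical_types.items()
        if logical_type == "Unknown"
    ]


warnings = check_unknown_types({"age": "Integer", "notes": "Unknown"})
```

Whether the result should be a warning or an error is exactly the open question in the issue; the sketch uses a warning only as a placeholder.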

bchen1116

enhancement

spike

Only run prophet unit tests in git-test-prophet command

Our Makefile lists the following for the `git-test-prophet` command:

```
pytest evalml/tests/component_tests/test_prophet_regressor.py evalml/tests/component_tests/test_components.py evalml/tests/component_tests/test_utils.py evalml/tests/pipeline_tests/ evalml/tests/utils_tests/
```

So we run the prophet unit tests in `component_tests/test_prophet_regressor` but we also run...

freddyaboulton

refactor

testing

tech debt

Cleanup/Issue Filing for TODOs

1

Just a placeholder issue for generating issues for the TODOs in our code. This can be turned into an epic as necessary. Successful completion of this issue/epic is building or associating...

chukarsten

task

spike

tech debt

Estimator: Hellinger distance decision trees (HDDT)

Good at handling imbalanced data (https://www3.nd.edu/~nchawla/papers/DMKD11.pdf). I couldn't find a widely used Python implementation. I did find [this](https://github.com/EvgeniDubov/hellinger-distance-criterion#example), which looks like it uses some Cython. The same author wrote [this nice...
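For orientation, a small sketch of the Hellinger distance as a binary split criterion, as I understand it from the linked paper: for each branch of a split, compare the fraction of all positives that fall in the branch with the fraction of all negatives that do. The function name and the count-based input format are assumptions for illustration; this is not the linked Cython implementation.

```python
import math
from typing import Sequence


def hellinger_distance(
    pos_counts: Sequence[int], neg_counts: Sequence[int]
) -> float:
    """Hellinger distance between class-conditional branch distributions.

    pos_counts[j] / neg_counts[j] are the numbers of positive / negative
    samples that fall into branch j of a candidate split.
    """
    total_pos, total_neg = sum(pos_counts), sum(neg_counts)
    if total_pos == 0 or total_neg == 0:
        raise ValueError("Both classes must be present to score a split.")
    return math.sqrt(
        sum(
            (math.sqrt(p / total_pos) - math.sqrt(n / total_neg)) ** 2
            for p, n in zip(pos_counts, neg_counts)
        )
    )


# 10 positives vs. 90 negatives: a 1:9 class imbalance.
perfect_split = hellinger_distance([10, 0], [0, 90])
useless_split = hellinger_distance([5, 5], [45, 45])
```

Because each class is normalized by its own total, the score depends only on how the split separates the classes, not on how rare the positive class is, which is why the criterion is attractive for imbalanced data.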

dsherry

new feature

needs design

performance

new component

Add model selection split for automl search

Rather than relying on the CV scores to rank the pipelines on the leaderboard, perhaps we should have a model selection split where we hold out some data and rank...
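A toy sketch of the proposal: hold out a fraction of rows up front, then rank candidates by their score on that holdout rather than by CV scores. Everything here (`holdout_split`, `rank_on_holdout`, the 20% fraction, the use of mean absolute error, and models as plain callables) is an assumption made for illustration and says nothing about how AutoML search in evalml is structured.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple


def holdout_split(
    n_rows: int, holdout_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Shuffle row indices and return (search_indices, holdout_indices)."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_holdout = max(1, int(n_rows * holdout_fraction))
    return indices[n_holdout:], indices[:n_holdout]


def mean_absolute_error(
    y_true: Sequence[float], y_pred: Sequence[float]
) -> float:
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rank_on_holdout(
    models: Dict[str, Callable[[float], float]],
    X: Sequence[float],
    y: Sequence[float],
    holdout_idx: Sequence[int],
) -> List[Tuple[str, float]]:
    """Score each model on the holdout rows only; lowest error first."""
    X_holdout = [X[i] for i in holdout_idx]
    y_holdout = [y[i] for i in holdout_idx]
    scored = [
        (name, mean_absolute_error(y_holdout, [model(x) for x in X_holdout]))
        for name, model in models.items()
    ]
    return sorted(scored, key=lambda item: item[1])


X = [float(i) for i in range(10)]
y = [2.0 * x for x in X]
search_idx, holdout_idx = holdout_split(len(X))
models = {"identity": lambda x: x, "double": lambda x: 2.0 * x}
ranking = rank_on_holdout(models, X, y, holdout_idx)
```

In a real search the models would be fit using only the `search_idx` rows (including any CV), so the holdout rows never influence training or tuning.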

angela97lin

new feature

needs design

spike

Enable CatBoost with LIME explain_predictions

Follow up from PR #2905 where we temporarily disabled running `explain_predictions` and `explain_predictions_best_worst` with CatBoost models running the LIME algorithm (details in comment [here](https://github.com/alteryx/evalml/pull/2905#discussion_r733115415)). We should figure out how to...

eccabay

enhancement
