evalml
EvalML is an AutoML library written in Python.
Follow-up from #1858. We should allow pipelines with repeating names to be passed in to `AutoMLSearch`. Some things to consider: 1. The tuners in `AutoMLAlgorithm` are keyed by pipeline name....
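To illustrate why keying tuners by pipeline name conflicts with repeated names, here is a minimal, library-independent sketch. The `unique_keys` helper and the `tuner-N` placeholder strings are hypothetical and are not part of EvalML; suffixing repeats is only one possible way to disambiguate.

```python
from collections import Counter


def unique_keys(names):
    """Return one key per name, suffixing repeats so that keys do not collide.

    Hypothetical helper for illustration only. It does not guard against a
    caller already using a name such as "Random Forest_2".
    """
    seen = Counter()
    keys = []
    for name in names:
        seen[name] += 1
        # The first occurrence keeps its name; later ones get "_2", "_3", ...
        keys.append(name if seen[name] == 1 else f"{name}_{seen[name]}")
    return keys


pipeline_names = ["Random Forest", "Random Forest", "Elastic Net"]

# Keying directly by name silently overwrites one "Random Forest" entry.
naive = {name: f"tuner-{i}" for i, name in enumerate(pipeline_names)}

# Keying by a disambiguated name keeps one entry per pipeline.
keyed = {key: f"tuner-{i}" for i, key in enumerate(unique_keys(pipeline_names))}

print(len(naive), len(keyed))
```

With the naive mapping, two pipelines that share a name would end up sharing (or overwriting) a single tuner, which is the kind of collision the issue asks to consider.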
As we expand into the OS community and outside users contribute to EvalML, it would be nice to have a list of contributors for each release as Featuretools, Woodwork, and...
In #2815, we added integration tests for DataCheckActions + DataChecks. I think another potential use case is doing AutoMLSearch + model_understanding. We do something similar with LG so I'm open...
https://github.com/alteryx/evalml/pull/2846 added Vowpal Wabbit estimators to EvalML, but they are currently not used in AutoMLSearch. This would require performance testing and determining a good set of hyperparameter ranges for the...
In #2365, we noticed that some ARIMA tests took about a minute to run despite only running on 500 rows. @ParthivNaresh thinks it may be because the starting set of...
```python
import woodwork as ww
import pandas as pd
import numpy as np
from evalml.pipelines.components import Imputer

df = ww.DataTable(pd.DataFrame({
    "all nan": [np.nan, np.nan, np.nan, np.nan, np.nan],
    "all nan cat":...
```
Currently, `AutoMLSearch` will only fit `OneHotEncoder`s with `top_n` [set to 10](https://github.com/alteryx/evalml/blob/main/evalml/pipelines/components/transformers/encoders/onehot_encoder.py#L25). This can be problematic because a user can have data with more than 10 categories, e.g. 50 US states,...
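The effect of a fixed `top_n` on a high-cardinality column can be sketched without EvalML. The code below is not EvalML's `OneHotEncoder`; `top_n_categories`, `one_hot`, and the synthetic `state_i` values are hypothetical names used only to show that categories outside the top 10 receive no indicator column.

```python
from collections import Counter


def top_n_categories(values, top_n=10):
    """Return the top_n most frequent categories (ties keep first-seen order)."""
    counts = Counter(values)
    return [category for category, _ in counts.most_common(top_n)]


def one_hot(values, categories):
    """Encode each value as a 0/1 row over the kept categories only."""
    return [[1 if value == category else 0 for category in categories]
            for value in values]


# Synthetic column with 50 distinct categories, standing in for 50 US states.
states = [f"state_{i}" for i in range(50)]

kept = top_n_categories(states, top_n=10)
encoded = one_hot(states, kept)

# Rows whose category was not kept are all zeros, so they become
# indistinguishable from one another after encoding.
unencoded_rows = sum(1 for row in encoded if not any(row))
print(len(kept), unencoded_rows)
```

In this toy column, 40 of the 50 categories collapse into the same all-zero row, which is the information loss the issue describes.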
After we introduce action codes that indicate changes to the AutoML configuration, we should update the methods that handle actions (`make_pipeline_from_data_check_output` and `make_pipeline_from_actions`) so that they also handle AutoML configuration recommendations.
Per @kmax12's comment in #917, we could do a label leakage check after training a model, checking if it scored very highly but only has a single feature with all...
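The sentence above is cut off, so the exact criterion is not known. Purely as an illustration, the sketch below assumes the check would flag a model that scores very highly while one feature carries nearly all of the feature importance. The function name, both thresholds, and the importance dictionary are hypothetical and do not come from EvalML.

```python
def suspect_leakage(score, importances, score_threshold=0.99, share_threshold=0.95):
    """Return the name of a feature suspected of leaking the label, or None.

    Assumptions (not from the source): `score` is higher-is-better, and
    `importances` maps feature names to importance values.
    """
    total = sum(abs(value) for value in importances.values())
    if score < score_threshold or total == 0:
        return None
    # Find the single most important feature and its share of the total.
    name, value = max(importances.items(), key=lambda item: abs(item[1]))
    return name if abs(value) / total >= share_threshold else None


importances = {"account_closed": 0.98, "age": 0.01, "tenure": 0.01}
print(suspect_leakage(0.995, importances))
print(suspect_leakage(0.80, importances))
```

A check like this would run after training, so it complements rather than replaces checks that inspect the data before a model is fit.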