evalml
evalml copied to clipboard
EvalML is an AutoML library written in python.
We saw `test_score_pipelines_passes_X_train_y_train[multiclass-cf_threaded]` flake on the linux nightlies on 2021-10-27. Logs are [here](https://github.com/alteryx/evalml/runs/4018971157?check_suite_focus=true) We've seen this flake a couple of times (@angela97lin saw it fail on her PR once) so...
Going back to https://github.com/alteryx/evalml/issues/1400, now that we can pickle pipelines, we should explore trying to use BentoML (or other open source libraries) to deploy pickled pipelines. Would be a great...
In #2243, we added `X_schema` and `y_schema` arguments to `train_pipeline` and `score_pipeline` because the dask engine would lose the schema once it pickles the dataframe and sends it to the...
Repro: ```python from evalml.demos import load_fraud from evalml.pipelines import BinaryClassificationPipeline from evalml.model_understanding import explain_predictions_best_worst import json class Fraud(BinaryClassificationPipeline): custom_name = "Fraud Pipeline" component_graph = ["Imputer", "DateTime Featurization Component", "Text Featurization...
Post the epic to include recommended actions, we should address adding actions to the ClassImbalanceDataCheck to override AutoML's default behavior for handling data splitting.
Following #1515 and https://alteryx.quip.com/CSE4AuFmVZbU/Multicollinearity-Data-Check-v1, it'd be nice to find a data set with multicollinear columns and show the value of the data check in action :)
We should add a data check action code (`UPDATE_AUTOML_CONFIG`) that indicates recommending changes to automl config settings. A good place to use this could be in the `ClassImbalanceDataCheck`, where we...
Running `automl.search()` fails when the target column is an `IntegerNullable` logical type. This was discovered on a time series regression problem, but have not tested to see if this same...
I just put the problem_ Type="binary" becomes "multiclass" #### ***************************** * Beginning pipeline search * ***************************** Optimizing for Log Loss Multiclass. Lower score is better. Using SequentialEngine to train and...
Currently, the `PolynomialDecomposer` calls `freq_to_period` based on the time series frequency string stored in the `DateTimeIndex` of the target variable. This returns a periodicity integer which may not always reflect...