evalml
EvalML is an AutoML library written in Python.
In #2667, we added the ability for AutoMLSearch to create its own Dask LocalCluster to perform work in parallel. An important aspect of using a Dask LocalCluster is turning...
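A minimal sketch (assumed setup, not EvalML's actual internals) of spinning up a Dask LocalCluster, submitting work to it, and shutting it down cleanly, which is the kind of lifecycle AutoMLSearch has to manage when it owns the cluster:

```python
from dask.distributed import Client, LocalCluster

# processes=False keeps workers in-process (threads), which is enough for a demo
cluster = LocalCluster(n_workers=2, threads_per_worker=1, processes=False,
                       dashboard_address=None)
client = Client(cluster)

# A toy task standing in for a pipeline fit/score call
result = client.submit(lambda x: x + 1, 41).result()
print(result)  # 42

# Tearing the cluster down cleanly is the part the issue is concerned with
client.close()
cluster.close()
```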
Preview of the error: We've noticed this for several of our perf-testing datasets, one in particular being `regress.csv`. CloudWatch link [here](https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#logsV2:log-groups/log-group/$252Fecs$252Fevalml-test/log-events/ecs$252Fprod-evalml$252F9da7d5e7ccc44ca6a2fdd9c3943d1997). The warning being printed: ``` 2021-09-09T17:33:44.934-04:00 |...
[CloudWatch link](https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#logsV2:log-groups/log-group/$252Fecs$252Fevalml-test/log-events/ecs$252Fprod-evalml$252F9da7d5e7ccc44ca6a2fdd9c3943d1997)

The warning being printed:

```
Objective did not converge. You might want to increase the number of iterations. Duality gap: 2.3056943635107245, tolerance: 0.021498518468096656
```

Unclear which dataset this...
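This warning comes from scikit-learn's coordinate-descent solvers (e.g. `Lasso`/`ElasticNet`). A hedged sketch on synthetic data (not the perf-testing datasets) showing how the warning is triggered when `max_iter` is exhausted before the duality gap drops below `tol`, and how raising `max_iter` makes it go away:

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
y = X @ rng.normal(size=20) + rng.normal(size=100)

# Deliberately starve the solver of iterations to trigger the warning
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ElasticNet(max_iter=2, tol=1e-10).fit(X, y)
starved = any(issubclass(w.category, ConvergenceWarning) for w in caught)

# With enough iterations (and the default tol) the solver converges quietly
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ElasticNet(max_iter=10000).fit(X, y)
converged = not any(issubclass(w.category, ConvergenceWarning) for w in caught)
print(starved, converged)
```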
## Reasoning

It is not clear that AutoMLSearch is tuning the pipeline threshold during search.
- It's not mentioned in our docs: https://evalml.alteryx.com/en/stable/user_guide/automl.html
- It's not mentioned in the logs...
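For context, a toy sketch (not EvalML's actual implementation) of what tuning a binary pipeline's threshold means: pick the decision threshold on held-out predicted probabilities that maximizes an objective such as F1, rather than defaulting to 0.5. The `tune_threshold` helper is hypothetical:

```python
import numpy as np
from sklearn.metrics import f1_score

def tune_threshold(y_true, y_proba, candidates=None):
    """Return the candidate threshold with the best F1 on the holdout."""
    if candidates is None:
        candidates = np.linspace(0.01, 0.99, 99)
    scores = [f1_score(y_true, y_proba >= t) for t in candidates]
    return candidates[int(np.argmax(scores))]

# Toy holdout: probabilities are informative but the classes are imbalanced,
# so the F1-optimal threshold lands well below the default 0.5
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1])
y_proba = np.array([0.05, 0.1, 0.2, 0.25, 0.3, 0.35, 0.4, 0.9])
best = tune_threshold(y_true, y_proba)
print(best)
```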
When running the nyc_taxi.csv dataset (~1.5M rows, 19 columns) through AutoMLSearch, it uses more than 30 GB of memory when fitting an ensembling pipeline. This excessive...
Related to [this discussion](https://github.com/alteryx/evalml/pull/3373#discussion_r832473688). With [this PR](https://github.com/alteryx/evalml/pull/3373) we have consolidated most of the pipeline parameter-creating logic into the `AutoMLAlgorithm` class. However, there are still cases, like with the sampler parameters,...
Once the critical infrastructure for clustering is built, but before all the changes are added to `main`, we need to make sure we have comprehensive documentation for users to...
In #3419 we added a fix to handle Email and URL features in the default algorithm by backtracking through the provenance of such features, looking at not only the `EmailFeaturizer` and...
The `TimeSeriesRegularizer` currently sets the `window_length` to 5 and `threshold` to 0.8. This was done to support smaller datasets that wouldn't pass an `infer_frequency` check from WW with higher `window_length`...
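As a hedged illustration of the underlying frequency-inference problem, using pandas' `pd.infer_freq` as a stand-in for the Woodwork check the issue refers to: a regular datetime index has a cleanly inferable frequency, while one with dropped observations does not, which is the situation `TimeSeriesRegularizer` is meant to repair.

```python
import pandas as pd

# A regular daily index is inferred cleanly...
regular = pd.date_range("2022-01-01", periods=10, freq="D")
print(pd.infer_freq(regular))  # "D"

# ...but dropping observations makes the overall frequency un-inferable
irregular = regular.delete([3, 7])
print(pd.infer_freq(irregular))  # None
```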
As of scikit-learn 0.22, `ccp_alpha` has been added as a pruning parameter for Decision Trees, Extra Trees, and Random Forests. Adding it as a hyperparameter would give AutoML an...
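A short sketch of what `ccp_alpha` does and how a tuner could bound its search space: larger values prune more aggressively, and `cost_complexity_pruning_path` exposes the effective alphas at which subtrees fall away. The dataset here is a stand-in, not one of EvalML's benchmarks:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# A larger ccp_alpha yields a smaller (pruned) tree
unpruned = DecisionTreeClassifier(random_state=0).fit(X, y)
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=0.02).fit(X, y)
print(unpruned.tree_.node_count, pruned.tree_.node_count)

# The pruning path gives candidate alphas a hyperparameter search could draw from
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X, y)
print(path.ccp_alphas[:3])
```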