evalml

EvalML is an AutoML library written in Python.

Results 251 evalml issues

In #2667, we added the ability for AutoMLSearch to create its own Dask LocalCluster to perform work in parallel. An important aspect of using a Dask LocalCluster is turning...

refactor

Preview of the error: ![image](https://user-images.githubusercontent.com/22552445/132767824-2af2d1c6-46ef-49f7-ab06-0e3392615ce0.png) We've noticed this for several of our perf testing datasets. One in particular was `regress.csv`. Cloudwatch link [here](https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#logsV2:log-groups/log-group/$252Fecs$252Fevalml-test/log-events/ecs$252Fprod-evalml$252F9da7d5e7ccc44ca6a2fdd9c3943d1997). The warning being printed: ``` 2021-09-09T17:33:44.934-04:00 |...

bug
good first issue
performance

[Cloudwatch link](https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#logsV2:log-groups/log-group/$252Fecs$252Fevalml-test/log-events/ecs$252Fprod-evalml$252F9da7d5e7ccc44ca6a2fdd9c3943d1997) The warning being printed: ``` Objective did not converge. You might want to increase the number of iterations. Duality gap: 2.3056943635107245, tolerance: 0.021498518468096656 ``` Unclear which dataset this...

bug

## Reasoning It is not clear that AutoMLSearch is tuning the pipeline threshold during search. - It's not mentioned in our docs: https://evalml.alteryx.com/en/stable/user_guide/automl.html - It's not mentioned in the logs...

new feature
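
To make concrete what "tuning the pipeline threshold" means for a binary classifier, here is a minimal sketch in plain Python. It is not EvalML's implementation: the helper names `f1` and `tune_threshold`, the grid of candidate thresholds, and the toy data are all assumptions made for illustration. It scans candidate decision thresholds over predicted probabilities and keeps the one that maximizes F1, then compares the result with the default 0.5 cutoff.

```python
from typing import Sequence, Tuple


def f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 score for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def tune_threshold(
    y_true: Sequence[int], probabilities: Sequence[float], steps: int = 101
) -> Tuple[float, float]:
    """Hypothetical helper: grid-search the decision threshold that maximizes F1."""
    best_threshold, best_score = 0.5, -1.0
    for i in range(steps):
        threshold = i / (steps - 1)  # evenly spaced candidates in [0, 1]
        preds = [1 if p >= threshold else 0 for p in probabilities]
        score = f1(y_true, preds)
        if score > best_score:  # keep the first threshold reaching the best score
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Toy data (made up): the positive class tends to get probabilities below 0.5.
y_true = [0, 0, 0, 1, 1, 1, 0, 1]
probs = [0.05, 0.20, 0.35, 0.30, 0.45, 0.80, 0.10, 0.40]

threshold, score = tune_threshold(y_true, probs)
default_score = f1(y_true, [1 if p >= 0.5 else 0 for p in probs])
print(f"tuned threshold={threshold:.2f}, F1={score:.3f}, F1 at 0.5={default_score:.3f}")
```

On data like this, where the model's probabilities are skewed low, the tuned threshold can differ noticeably from 0.5, which is why it matters whether the search reports that tuning took place.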

When running nyc_taxi.csv dataset through AutoML search (which has ~1.5M rows and 19 columns), it uses up more than 30 GB of memory when fitting an ensembling pipeline. This excessive...

performance

Related to [this discussion](https://github.com/alteryx/evalml/pull/3373#discussion_r832473688). With [this PR](https://github.com/alteryx/evalml/pull/3373) we have consolidated most of the pipeline parameter-creating logic into the `AutoMLAlgorithm` class. However, there are still cases, like with the sampler parameters,...

enhancement

Once the critical infrastructure for clustering is built but before all the changes are added to main, we need to make sure we have comprehensive documentation for users to...

documentation

In #3419 we added a fix to handle Email and URL features in the default algorithm by backtracking and looking at not only the feature provenance `EmailFeaturizer` and...

The `TimeSeriesRegularizer` currently sets the `window_length` to 5 and `threshold` to 0.8. This was done to support smaller datasets that wouldn't pass an `infer_frequency` check from WW with higher `window_length`...

As of sklearn version 0.22, `ccp_alpha` has been added as a pruning parameter for Decision Trees, Extra Trees, and Random Forests. Adding this as a hyperparameter would give AutoML an...

new feature
performance
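
As a rough illustration of the `ccp_alpha` parameter mentioned in the last issue, the sketch below uses scikit-learn directly (0.22 or later), not EvalML. The synthetic dataset and the value `0.02` are arbitrary assumptions. It fits one unpruned and one cost-complexity-pruned decision tree, then asks scikit-learn for the candidate alpha values along the pruning path, which is one possible way to pick a search range for the hyperparameter.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, used only for illustration.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# ccp_alpha defaults to 0.0, which means no cost-complexity pruning.
unpruned = DecisionTreeClassifier(random_state=0).fit(X, y)

# A larger ccp_alpha prunes more aggressively; 0.02 is an arbitrary choice.
pruned = DecisionTreeClassifier(ccp_alpha=0.02, random_state=0).fit(X, y)

print("nodes without pruning:", unpruned.tree_.node_count)
print("nodes with ccp_alpha=0.02:", pruned.tree_.node_count)

# The pruning path lists the effective alphas at which subtrees get removed.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X, y)
candidate_alphas = path.ccp_alphas
print("number of candidate alphas:", len(candidate_alphas))
```

The same `ccp_alpha` argument is accepted by scikit-learn's Extra Trees and Random Forest estimators, so a hyperparameter range derived this way could in principle be reused across those model families.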