evalml issues

2478 datachecks for unknown types

1

### Pull Request Description

Adds a datacheck that flags a dataset when more than 50% of it is unknown. Closes #2478

MichaelFu512
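As a rough illustration of the check described in this pull request, the sketch below flags a dataset when more than half of its columns carry an unknown type. It does not use the EvalML or Woodwork APIs: the function name `check_unknown_fraction`, the plain dict of column names to type names, and the reading of "unknowns per dataset" as a per-column count are all assumptions made for this example.

```python
def check_unknown_fraction(logical_types, threshold=0.5):
    """Return a warning string if too many columns are typed "Unknown".

    logical_types: dict mapping column name -> type name (plain strings here;
    a hypothetical stand-in for whatever type information the real check uses).
    threshold: fraction of columns that must be exceeded to trigger a warning.
    """
    if not logical_types:
        return None  # nothing to check on an empty dataset

    unknown_cols = [name for name, t in logical_types.items() if t == "Unknown"]
    fraction = len(unknown_cols) / len(logical_types)

    # Strictly greater than the threshold: exactly 50% does not warn.
    if fraction > threshold:
        return f"{fraction:.0%} of columns are Unknown: {unknown_cols}"
    return None


warning = check_unknown_fraction(
    {"notes": "Unknown", "blob": "Unknown", "age": "Integer"}
)
```

Here two of three columns are unknown, so `warning` holds a message; a dataset with exactly half of its columns unknown would pass.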

Support Woodwork v0.18.0

gsheni

Transition EvalML to use pyproject.toml only (move away from setup.cfg)

- pyproject.toml is the future of Python package metadata and tool config
- "One file to rule them all"
- Examples
  - https://github.com/alteryx/featuretools/issues/2261
  - https://github.com/alteryx/woodwork/pull/1506

gsheni

Multi-table support for featuretools component

1

Extension of issue [470](https://github.com/alteryx/evalml/issues/470). PR [1454](https://github.com/alteryx/evalml/pull/1454) addresses adding the FeatureTools component, but only handles single dataframes/datatables. In order to use FeatureTools fully, we want to be able to use it...

bchen1116

new feature

needs design

spike

Model debugging: Add ability to compute and store graphs/stats on each CV fold during automl search

2

**Goal** If users intend to compute graphs and stats on any of the models trained during each CV fold, we should design an API which allows them to do so....

dsherry

new feature

needs design
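The model-debugging issue above is still marked as needing design, so the following is only one possible shape for such an API, not a proposal taken from the issue. It assumes user-supplied stat functions that are called once per CV fold, with the results stored per fold. All names (`cross_validate_with_stats`, `fold_stats`, `fit_mean`) are hypothetical.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train, test) index lists for contiguous, unshuffled folds."""
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if not start <= i < start + size]
        yield train, test
        start += size


def cross_validate_with_stats(X, y, fit, fold_stats, n_folds=3):
    """Train one model per fold and store every requested stat for that fold."""
    results = []
    for fold, (train, test) in enumerate(k_fold_indices(len(X), n_folds)):
        model = fit([X[i] for i in train], [y[i] for i in train])
        X_test = [X[i] for i in test]
        y_test = [y[i] for i in test]
        # Each stat function receives the fitted model and the held-out data.
        stats = {name: fn(model, X_test, y_test) for name, fn in fold_stats.items()}
        results.append({"fold": fold, **stats})
    return results


# Toy "model": predicts the mean of the training targets.
def fit_mean(X_train, y_train):
    return sum(y_train) / len(y_train)


def mean_absolute_error(model, X_test, y_test):
    return sum(abs(model - target) for target in y_test) / len(y_test)


results = cross_validate_with_stats(
    X=list(range(6)),
    y=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    fit=fit_mean,
    fold_stats={"mae": mean_absolute_error},
)
```

Passing a dict of named functions keeps the set of stats open-ended: a graph-producing function could be registered the same way as a scalar metric.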

Sparse matrix support

Many of our components can support passing [sparse matrices](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_matrix.html), which is critical for large sparse datasets. Note in addition to figuring out which components and pipelines could support this without...

dsherry

new feature

needs design

performance

Add support for GPU acceleration

2

In the usability blitz, @christopherbunn showed [what looked like an 8x speedup in wall-clock runtime](https://alteryx.quip.com/gwW9AQg5m0Nq/Evalml-Usability-Blitz-April-2020#PFNACAoq60N) when GPU support was enabled in our catboost component. So yes, GPUs are awesome :)...

dsherry

new feature

needs design

Search iteration plot can get skewed if any scores are outliers

This came up [in the usability blitz](https://alteryx.quip.com/gwW9AQg5m0Nq/Evalml-Usability-Blitz-April-2020#PFNACAJQ4nk). If an initial model has a poor score, the plot scale can make it totally unreadable.

dsherry

enhancement
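The plot-scaling issue above does not say how it should be fixed. One possible approach, sketched here as an assumption, is to derive the y-axis limits from the median and the median absolute deviation (MAD) so that a single poor score cannot stretch the scale. The helper name `robust_ylim` and the multiplier `k=5.0` are arbitrary choices for illustration.

```python
import statistics


def robust_ylim(scores, k=5.0):
    """Return (lower, upper) axis limits that ignore extreme outliers.

    Limits are median +/- k * MAD, clamped to the observed score range.
    """
    center = statistics.median(scores)
    mad = statistics.median([abs(s - center) for s in scores])

    # If at least half the scores are identical, MAD is 0; fall back to the
    # full range rather than collapsing the axis to a single point.
    if mad == 0:
        return min(scores), max(scores)

    lower = max(min(scores), center - k * mad)
    upper = min(max(scores), center + k * mad)
    return lower, upper


# Five similar scores and one outlier that would otherwise dominate the plot.
scores = [0.80, 0.82, 0.81, 0.83, 0.79, 100.0]
lower, upper = robust_ylim(scores)
```

The returned pair could be passed to a plotting library's y-axis limit setting. Outlying points would then fall outside the visible range, so a real implementation would probably want to mark them in some way.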

RandomSearch/GridSearch tuners: if search space exhausted for one pipeline type, entire search stops

**Problem** #230 added `RandomSearch` and `GridSearch` tuners. Unlike the `SKOptTuner`, those tuners have potentially finite search spaces, and can eventually run out of parameters to suggest (particularly `GridSearch`). If automl...

dsherry

bug
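To make the tuner problem above concrete, the sketch below uses a toy grid tuner with a finite search space and a search loop that drops an exhausted pipeline type instead of stopping altogether. This is an illustration of one possible behaviour, not EvalML's implementation: `ToyGridTuner`, `SearchSpaceExhausted`, and `run_search` are hypothetical names.

```python
import itertools


class SearchSpaceExhausted(Exception):
    """Raised when a tuner has no parameter combinations left to suggest."""


class ToyGridTuner:
    """Walks a finite grid of parameter values exactly once."""

    def __init__(self, space):
        self._names = list(space)
        self._grid = itertools.product(*(space[name] for name in self._names))

    def propose(self):
        try:
            values = next(self._grid)
        except StopIteration:
            raise SearchSpaceExhausted from None
        return dict(zip(self._names, values))


def run_search(tuners, max_iterations):
    """Round-robin over pipeline types, skipping any whose grid has run out."""
    active = dict(tuners)
    proposals = []
    names = itertools.cycle(list(tuners))
    while active and len(proposals) < max_iterations:
        name = next(names)
        if name not in active:
            continue
        try:
            params = active[name].propose()
        except SearchSpaceExhausted:
            # Retire only this pipeline type; the others keep searching.
            del active[name]
            continue
        proposals.append((name, params))
    return proposals


proposals = run_search(
    {
        "small": ToyGridTuner({"a": [1, 2]}),
        "large": ToyGridTuner({"b": [1, 2, 3, 4]}),
    },
    max_iterations=10,
)
```

The "small" grid runs out after two proposals, yet the loop still collects all four proposals from "large" before ending because no active tuners remain.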

Support for more generic feature selectors

1

Currently, we only have SelectFromModel. It would be nice to support some feature selectors (ex: SelectKBest, SelectPercentile) that don't rely on an estimator and instead simply select features using statistical...

angela97lin

enhancement

needs design

new component
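The feature selectors mentioned in the issue above rank features with a statistic rather than a fitted estimator. The sketch below shows that idea in plain Python, scoring each column by its absolute Pearson correlation with the target and keeping the top `k`. It is written in the spirit of SelectKBest but is not scikit-learn's implementation, and the scoring function is an assumption chosen for simplicity.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_k_best(columns, target, k):
    """Return the names of the k columns most correlated with the target.

    columns: dict mapping column name -> list of numeric values.
    No estimator is trained; each column is scored independently.
    """
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


columns = {
    "signal": [1, 2, 3, 4, 5],
    "noise": [3, 1, 3, 1, 3],
    "constant": [7, 7, 7, 7, 7],
}
target = [2, 4, 6, 8, 10]
selected = select_k_best(columns, target, k=1)
```

Because every column is scored on its own, this kind of selector is cheap to run, but it cannot detect features that are only useful in combination.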
