
State-of-the-art Automated Machine Learning Python library for Tabular Data

AutoML Alex




Works with Tasks:

  • [x] Binary Classification

  • [x] Regression

  • [ ] Multiclass Classification (in progress...)

Benchmark Results

[Figure: benchmark results. Higher is better. Source: AutoML-Benchmark]

Scheme

[Figure: scheme]

Features

  • Automated Data Cleaning (Auto Clean)
  • Automated Feature Engineering (Auto FE)
  • Smart Hyperparameter Optimization (HPO)
  • Feature Generation
  • Feature Selection
  • Model Selection
  • Cross Validation
  • Optimization Time Limit and Early Stopping
  • Save and Load (Predict new data)
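
The time limit and early stopping items can be illustrated with a minimal sketch. This is not AutoML Alex's implementation: it is a plain random search written for illustration, and `random_search`, `patience`, `toy_objective`, and `sample_x` are hypothetical names. The loop ends when the time budget is spent or when no trial has improved the best score for `patience` consecutive trials.

```python
import random
import time


def random_search(objective, sample_params, timeout=1.0, patience=20, seed=0):
    """Maximise `objective` until the time budget or the patience runs out."""
    rng = random.Random(seed)
    deadline = time.monotonic() + timeout
    best_params, best_score = None, float("-inf")
    trials_since_improvement = 0

    while time.monotonic() < deadline:
        params = sample_params(rng)
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
            trials_since_improvement = 0
        else:
            trials_since_improvement += 1
            if trials_since_improvement >= patience:
                break  # early stopping: no improvement for `patience` trials
    return best_params, best_score


def toy_objective(params):
    # Toy objective with a known maximum of 0.0 at x = 3.
    return -(params["x"] - 3.0) ** 2


def sample_x(rng):
    # Draw one candidate configuration from the search space.
    return {"x": rng.uniform(-10.0, 10.0)}


best_params, best_score = random_search(
    toy_objective, sample_x, timeout=0.5, patience=50
)
print(best_params, best_score)
```

A smarter optimizer would replace the uniform sampling, but the two stopping conditions would stay the same.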

Installation

pip install automl-alex

Docs

DocPage

🚀 Examples

Classifier:

from automl_alex import AutoMLClassifier

model = AutoMLClassifier()
model.fit(X_train, y_train, timeout=600)
predicts = model.predict(X_test)

Regression:

from automl_alex import AutoMLRegressor

model = AutoMLRegressor()
model.fit(X_train, y_train, timeout=600)
predicts = model.predict(X_test)

DataPrepare:

from automl_alex import DataPrepare

de = DataPrepare()
X_train = de.fit_transform(X_train)
X_test = de.transform(X_test)

Simple Models Wrapper:

from automl_alex import LightGBMClassifier

model = LightGBMClassifier()
model.fit(X_train, y_train)
predicts = model.predict_proba(X_test)

# Tune hyperparameters within a time budget, then predict again
model.opt(X_train, y_train,
    timeout=600,  # optimization time in seconds
    )
predicts = model.predict_proba(X_test)

More examples are in the ./examples folder.

What's inside

It integrates many popular frameworks:

  • scikit-learn
  • XGBoost
  • LightGBM
  • CatBoost
  • Optuna
  • ...

Works with Features

  • [x] Categorical Features

  • [x] Numerical Features

  • [x] Binary Features

  • [ ] Text

  • [ ] Datetime

  • [ ] Timeseries

  • [ ] Image

Note

  • Large datasets require a lot of memory, because the library creates many new features. If your dataset is large and has more than 100 features, expect high memory usage.
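
To get a rough feel for the memory note above, here is a back-of-the-envelope estimate. It assumes the data is held as a dense table of 8-byte floats, and the `expansion_factor` of 5 is a purely hypothetical number; the source does not say how many features the library generates.

```python
def dense_memory_gib(n_rows, n_columns, bytes_per_value=8):
    """Memory of a dense numeric table in GiB (8 bytes per value = float64)."""
    return n_rows * n_columns * bytes_per_value / 1024 ** 3


n_rows = 1_000_000
original_columns = 100
expansion_factor = 5  # hypothetical: generated features multiply the column count

before = dense_memory_gib(n_rows, original_columns)
after = dense_memory_gib(n_rows, original_columns * expansion_factor)
print(f"before: {before:.2f} GiB, after: {after:.2f} GiB")
```

Memory grows linearly with the number of columns, so any feature generation step multiplies the footprint of the original table by roughly the same factor.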

Realtime Dashboard

Works with optuna-dashboard

[Figure: Dashboard]

Run

$ optuna-dashboard sqlite:///db.sqlite3

Road Map

  • [x] Feature Generation

  • [x] Save/Load and Predict on New Samples

  • [x] Advanced Logging

  • [x] Add opt Pruners

  • [x] Docs Site

  • [ ] DL Encoders

  • [ ] Add More libs (NNs)

  • [ ] Multiclass Classification

  • [ ] Build pipelines

Contact

Telegram Group