xam

:dart: Personal data science and machine learning toolbox

xam is my personal data science and machine learning toolbox. It is written in Python 3 and stands on the shoulders of giants (mainly pandas and scikit-learn). It loosely follows scikit-learn's fit/transform/predict convention.
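To make the fit/transform convention concrete, here is a minimal, dependency-free sketch of a count encoder. The class name `CountEncoderSketch` and its behaviour for unseen categories are assumptions made for illustration; this is not xam's actual implementation or API.

```python
from collections import Counter


class CountEncoderSketch:
    """Hypothetical illustration of the fit/transform convention (not xam's class)."""

    def fit(self, X, y=None):
        # Learn how often each category appears in the training data.
        self.counts_ = Counter(X)
        # Returning self allows call chaining, as scikit-learn estimators do.
        return self

    def transform(self, X):
        # Replace each category with its training count (0 if it was never seen).
        return [self.counts_.get(x, 0) for x in X]

    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)


encoder = CountEncoderSketch()
encoded_train = encoder.fit_transform(['a', 'b', 'a', 'c', 'a'])
encoded_new = encoder.transform(['a', 'd'])
```

The state learned in `fit` is stored in an attribute with a trailing underscore, which is the scikit-learn naming habit for fitted parameters.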

Installation

:warning: Because xam is a personal toolkit, installing it with the --upgrade flag also upgrades each dependency (scipy, pandas, etc.) to its latest release. I like to stay up to date with the latest library versions.

Table of contents

Usage examples are available in the docs folder. Each example is tested with doctest.

  • Ensembling
    • Groupby model
    • LightGBM with CV
    • Stacking
    • Stacking with bagged test predictions
  • Exploratory data analysis (EDA)
    • Feature importance
  • Feature extraction
    • Bayesian target encoding
    • Combining features
    • Count encoding
    • Cyclic features
  • Feature selection
    • Forward-backward selection
  • Linear models
    • AUC regressor
  • Model selection
    • Ordered cross-validation
  • Natural Language Processing (NLP)
    • NB-SVM
    • Norvig spelling corrector
    • Top-terms classifier
  • Pipeline
    • Column selection
    • Series transformer
    • DataFrame transformer
    • Lambda transformer
  • Plotting
    • LaTeX-style figures
  • Preprocessing
    • Binning
    • Groupby transformer
    • One-hot encoding
    • Resampling
  • Time series analysis (TSA)
    • Exponentially weighted average
    • Exponential smoothing
    • Frequency average forecasting
  • Various
    • Datetime range
    • Next day of the week
    • Subsequence lengths
    • DataFrame to Vowpal Wabbit
    • Normalized compression distance
    • Skyline querying
    • Fuzzy duplicates
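
As a rough illustration of what a doctest-checked example looks like, here is a sketch of cyclic feature encoding using only the standard library. The function name `cyclic_encode` and its signature are assumptions for this sketch and may differ from what xam provides.

```python
import math


def cyclic_encode(value, period):
    """Map a periodic value (hour of day, month, ...) onto the unit circle.

    Hypothetical helper for illustration; not taken from xam.

    >>> cyclic_encode(0, 24)
    (0.0, 1.0)
    >>> cyclic_encode(6, 24)
    (1.0, 0.0)
    """
    angle = 2 * math.pi * value / period
    # Rounding keeps the doctest output stable despite floating-point noise.
    return round(math.sin(angle), 6), round(math.cos(angle), 6)


if __name__ == '__main__':
    import doctest
    doctest.testmod()
```

With this encoding, hour 23 ends up close to hour 0 on the circle, which a raw integer feature would not capture.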

Other Python data science and machine learning toolkits

License

The MIT License (MIT). Please see the license file for more information.