Awesome MLOps

A curated list of awesome MLOps tools.

Inspired by awesome-python.

  • Awesome MLOps
    • AutoML
    • CI/CD for Machine Learning
    • Cron Job Monitoring
    • Data Catalog
    • Data Enrichment
    • Data Exploration
    • Data Management
    • Data Processing
    • Data Validation
    • Data Visualization
    • Feature Engineering
    • Feature Store
    • Hyperparameter Tuning
    • Knowledge Sharing
    • Machine Learning Platform
    • Model Fairness and Privacy
    • Model Interpretability
    • Model Lifecycle
    • Model Serving
    • Model Testing & Validation
    • Optimization Tools
    • Simplification Tools
    • Visual Analysis and Debugging
    • Workflow Tools
  • Resources
    • Articles
    • Books
    • Events
    • Other Lists
    • Podcasts
    • Slack
    • Websites
  • Contributing


AutoML

Tools for performing AutoML.

  • AutoGluon - Automates machine learning tasks enabling you to easily achieve strong predictive performance.
  • AutoKeras - Aims to make machine learning accessible to everyone.
  • AutoPyTorch - Automatic architecture search and hyperparameter optimization for PyTorch.
  • AutoSKLearn - Automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator.
  • EvalML - A library that builds, optimizes, and evaluates ML pipelines using domain-specific functions.
  • FLAML - Finds accurate ML models automatically, efficiently and economically.
  • H2O AutoML - Automates ML workflow, which includes automatic training and tuning of models.
  • MindsDB - AI layer for databases that allows you to effortlessly develop, train and deploy ML models.
  • MLBox - A powerful automated machine learning Python library.
  • Model Search - Framework that implements AutoML algorithms for model architecture search at scale.
  • NNI - An open source AutoML toolkit for automating the machine learning lifecycle.

CI/CD for Machine Learning

Tools for performing CI/CD for Machine Learning.

  • ClearML - Auto-Magical CI/CD to streamline your ML workflow.
  • CML - Open-source library for implementing CI/CD in machine learning projects.

Cron Job Monitoring

Tools for monitoring cron jobs (recurring jobs).

  • Cronitor - Monitor any cron job or scheduled task.
  • HealthchecksIO - Simple and effective cron job monitoring.

Data Catalog

Tools for data cataloging.

  • Amundsen - Data discovery and metadata engine for improving productivity when interacting with data.
  • Apache Atlas - Provides open metadata management and governance capabilities to build a data catalog.
  • CKAN - Open-source DMS (data management system) for powering data hubs and data portals.
  • DataHub - LinkedIn's generalized metadata search & discovery tool.
  • Magda - A federated, open-source data catalog for all your big data and small data.
  • Metacat - Unified metadata exploration API service for Hive, RDS, Teradata, Redshift, S3 and Cassandra.
  • OpenMetadata - A single place to discover, collaborate and get your data right.

Data Enrichment

Tools and libraries for data enrichment.

  • Upgini - Enriches training datasets with features from public and community shared data sources.

Data Exploration

Tools for performing data exploration.

  • Apache Zeppelin - Enables data-driven, interactive data analytics and collaborative documents.
  • BambooLib - An intuitive GUI for Pandas DataFrames.
  • Google Colab - Hosted Jupyter notebook service that requires no setup to use.
  • Jupyter Notebook - Web-based notebook environment for interactive computing.
  • JupyterLab - The next-generation user interface for Project Jupyter.
  • Jupytext - Jupyter Notebooks as Markdown Documents, Julia, Python or R scripts.
  • Polynote - The polyglot notebook with first-class Scala support.

Data Management

Tools for performing data management.

  • Arrikto - Dead simple, ultra fast storage for the hybrid Kubernetes world.
  • BlazingSQL - A lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF.
  • Delta Lake - Storage layer that brings scalable, ACID transactions to Apache Spark and other engines.
  • Dolt - SQL database that you can fork, clone, branch, merge, push and pull just like a Git repository.
  • Dud - A lightweight CLI tool for versioning data alongside source code and building data pipelines.
  • DVC - Management and versioning of datasets and machine learning models.
  • Git LFS - An open source Git extension for versioning large files.
  • Hub - A dataset format for creating, storing, and collaborating on AI datasets of any size.
  • Intake - A lightweight set of tools for loading and sharing data in data science projects.
  • lakeFS - Repeatable, atomic and versioned data lake on top of object storage.
  • Marquez - Collect, aggregate, and visualize a data ecosystem's metadata.
  • Milvus - An open source embedding vector similarity search engine powered by Faiss, NMSLIB and Annoy.
  • Pinecone - Managed and distributed vector similarity search used with a lightweight SDK.
  • Qdrant - An open source vector similarity search engine with extended filtering support.
  • Quilt - A self-organizing data hub with S3 support.

Data Processing

Tools related to data processing and data pipelines.

  • Airflow - Platform to programmatically author, schedule, and monitor workflows.
  • Azkaban - Batch workflow job scheduler created at LinkedIn to run Hadoop jobs.
  • Dagster - A data orchestrator for machine learning, analytics, and ETL.
  • Hadoop - Framework that allows for the distributed processing of large data sets across clusters.
  • Spark - Unified analytics engine for large-scale data processing.

Data Validation

Tools related to data validation.

  • Cerberus - Lightweight, extensible data validation library for Python.
  • Great Expectations - A Python data validation framework that allows you to test your data against datasets.
  • JSON Schema - A vocabulary that allows you to annotate and validate JSON documents.
  • TFDV - A library for exploring and validating machine learning data.

Data Visualization

Tools for data visualization, reports and dashboards.

  • Count - SQL/drag-and-drop querying and visualisation tool based on notebooks.
  • Dash - Analytical Web Apps for Python, R, Julia, and Jupyter.
  • Data Studio - Reporting solution for power users who want to go beyond the data and dashboards of GA.
  • Facets - Visualizations for understanding and analyzing machine learning datasets.
  • Lux - Fast and easy data exploration by automating the visualization and data analysis process.
  • Metabase - The simplest, fastest way to get business intelligence and analytics to everyone.
  • Redash - Connect to any data source, easily visualize, dashboard and share your data.
  • Superset - Modern, enterprise-ready business intelligence web application.
  • Tableau - Powerful and fastest growing data visualization tool used in the business intelligence industry.

Feature Engineering

Tools and libraries related to feature engineering.

  • Feature Engine - Feature engineering package with scikit-learn-like functionality.
  • Featuretools - Python library for automated feature engineering.
  • TSFresh - Python library for automatic extraction of relevant features from time series.

Feature Store

Feature store tools for data serving.

  • Butterfree - A tool for building feature stores. Transform your raw data into beautiful features.
  • ByteHub - An easy-to-use feature store. Optimized for time-series data.
  • Feast - End-to-end open source feature store for machine learning.
  • Feathr - An enterprise-grade, high performance feature store.
  • Tecton - A fully-managed feature platform built to orchestrate the complete lifecycle of features.

Hyperparameter Tuning

Tools and libraries to perform hyperparameter tuning.

  • Advisor - Open-source implementation of Google Vizier for hyperparameter tuning.
  • Hyperas - A very simple wrapper for convenient hyperparameter optimization.
  • Hyperopt - Distributed Asynchronous Hyperparameter Optimization in Python.
  • Katib - Kubernetes-based system for hyperparameter tuning and neural architecture search.
  • KerasTuner - Easy-to-use, scalable hyperparameter optimization framework.
  • Optuna - Open source hyperparameter optimization framework to automate hyperparameter search.
  • Scikit Optimize - Simple and efficient library to minimize expensive and noisy black-box functions.
  • Talos - Hyperparameter Optimization for TensorFlow, Keras and PyTorch.
  • Tune - Python library for experiment execution and hyperparameter tuning at any scale.

Knowledge Sharing

Tools for sharing knowledge with the entire team/company.

  • Knowledge Repo - Knowledge sharing platform for data scientists and other technical professions.
  • Kyso - One place for data insights so your entire team can learn from your data.

Machine Learning Platform

Complete machine learning platform solutions.

  • aiWARE - Helps MLOps teams evaluate, deploy, integrate, scale & monitor ML models.
  • Algorithmia - Securely govern your machine learning operations with a healthy ML lifecycle.
  • Allegro AI - Transform ML/DL research into products. Faster.
  • Bodywork - Deploys machine learning projects developed in Python, to Kubernetes.
  • CNVRG - An end-to-end machine learning platform to build and deploy AI models at scale.
  • DAGsHub - A platform built on open source tools for data, model and pipeline management.
  • Dataiku - Platform democratizing access to data and enabling enterprises to build their own path to AI.
  • DataRobot - AI platform that democratizes data science and automates the end-to-end ML at scale.
  • Domino - One place for your data science tools, apps, results, models, and knowledge.
  • Edge Impulse - Platform for creating, optimizing, and deploying AI/ML algorithms for edge devices.
  • envd - Machine learning development environment for data science and AI/ML engineering teams.
  • FedML - Simplifies the workflow of federated learning anywhere at any scale.
  • Gradient - Multicloud CI/CD and MLOps platform for machine learning teams.
  • H2O - Open source leader in AI with a mission to democratize AI for everyone.
  • Hopsworks - Open-source platform for developing and operating machine learning models at scale.
  • Iguazio - Data science platform that automates MLOps with end-to-end machine learning pipelines.
  • Katonic - Automate your cycle of intelligence with Katonic MLOps Platform.
  • Knime - Create and productionize data science using one easy and intuitive environment.
  • Kubeflow - Making deployments of ML workflows on Kubernetes simple, portable and scalable.
  • LynxKite - A complete graph data science platform for very large graphs and other datasets.
  • ML Workspace - All-in-one web-based IDE specialized for machine learning and data science.
  • MLReef - Open source MLOps platform that helps you collaborate, reproduce and share your ML work.
  • Modzy - AI platform and marketplace offering scalable, secure, and ready-to-deploy AI models.
  • - MLOps platform that integrates open-source and proprietary tools into client-oriented systems.
  • Pachyderm - Combines data lineage with end-to-end pipelines on Kubernetes, engineered for the enterprise.
  • Polyaxon - A platform for reproducible and scalable machine learning and deep learning on Kubernetes.
  • Sagemaker - Fully managed service that provides the ability to build, train, and deploy ML models quickly.
  • SigOpt - A platform that makes it easy to track runs, visualize training, and scale hyperparameter tuning.
  • Valohai - Takes you from POC to production while managing the whole model lifecycle.

Model Fairness and Privacy

Tools for ensuring model fairness and privacy in production.

  • AIF360 - A comprehensive set of fairness metrics for datasets and machine learning models.
  • Fairlearn - A Python package to assess and improve fairness of machine learning models.
  • Opacus - A library that enables training PyTorch models with differential privacy.
  • TensorFlow Privacy - Library for training machine learning models with privacy for training data.

Model Interpretability

Tools for performing model interpretability/explainability.

  • Alibi - Open-source Python library enabling ML model inspection and interpretation.
  • Captum - Model interpretability and understanding library for PyTorch.
  • ELI5 - Python package which helps to debug machine learning classifiers and explain their predictions.
  • InterpretML - A toolkit to help understand models and enable responsible machine learning.
  • LIME - Explaining the predictions of any machine learning classifier.
  • Lucid - Collection of infrastructure and tools for research in neural network interpretability.
  • SAGE - For calculating global feature importance using Shapley values.
  • SHAP - A game theoretic approach to explain the output of any machine learning model.
  • Skater - Unified framework to enable Model Interpretation for all forms of model.

Model Lifecycle

Tools for managing model lifecycle (tracking experiments, parameters and metrics).

  • Aim - A super-easy way to record, search and compare 1000s of ML training runs.
  • Comet - Track your datasets, code changes, experimentation history, and models.
  • Guild AI - Open source experiment tracking, pipeline automation, and hyperparameter tuning.
  • Keepsake - Version control for machine learning with support for Amazon S3 and Google Cloud Storage.
  • Losswise - Makes it easy to track the progress of a machine learning project.
  • MLflow - Open source platform for the machine learning lifecycle.
  • ModelDB - Open source ML model versioning, metadata, and experiment management.
  • Neptune AI - The most lightweight experiment management tool that fits any workflow.
  • Replicate - Library that uploads files and metadata (like hyperparameters) to S3 or GCS.
  • Sacred - A tool to help you configure, organize, log and reproduce experiments.
  • Weights and Biases - A tool for visualizing and tracking your machine learning experiments.

Model Serving

Tools for serving models in production.

  • Banana - Host your ML inference code on serverless GPUs and integrate it into your app with one line of code.
  • BentoML - Open-source platform for high-performance ML model serving.
  • BudgetML - Deploy an ML inference service on a budget in less than 10 lines of code.
  • Cortex - Machine learning model serving infrastructure.
  • Gradio - Create customizable UI components around your models.
  • GraphPipe - Machine learning model deployment made simple.
  • Hydrosphere - Platform for deploying your Machine Learning to production.
  • KFServing - Kubernetes custom resource definition for serving ML models on arbitrary frameworks.
  • Merlin - A platform for deploying and serving machine learning models.
  • MLEM - Version and deploy your ML models following GitOps principles.
  • Opyrator - Turns your ML code into microservices with web API, interactive GUI, and more.
  • PredictionIO - Event collection, deployment of algorithms, evaluation, querying predictive results via APIs.
  • Rune - Provides containers to encapsulate and deploy EdgeML pipelines and applications.
  • Seldon - Take your ML projects from POC to production with maximum efficiency and minimal risk.
  • Streamlit - Lets you create apps for your ML projects with deceptively simple Python scripts.
  • TensorFlow Serving - Flexible, high-performance serving system for ML models, designed for production.
  • TorchServe - A flexible and easy to use tool for serving PyTorch models.
  • Triton Inference Server - Provides an optimized cloud and edge inferencing solution.
  • Vespa - Store, search, organize and make machine-learned inferences over big data at serving time.

Model Testing & Validation

Tools for testing and validating models.

  • Deepchecks - Open-source package for validating ML models & data, with various checks and suites.

Optimization Tools

Optimization tools related to model scalability in production.

  • Accelerate - A simple way to train and use PyTorch models with multi-GPU, TPU, and mixed-precision support.
  • Dask - Provides advanced parallelism for analytics, enabling performance at scale for the tools you love.
  • DeepSpeed - Deep learning optimization library that makes distributed training easy, efficient, and effective.
  • Fiber - Python distributed computing library for modern computer clusters.
  • Horovod - Distributed deep learning training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
  • Mahout - Distributed linear algebra framework and mathematically expressive Scala DSL.
  • MLlib - Apache Spark's scalable machine learning library.
  • Modin - Speed up your Pandas workflows by changing a single line of code.
  • Nebulgym - Easy-to-use library to accelerate AI training.
  • Nebullvm - Easy-to-use library to boost AI inference.
  • Petastorm - Enables single machine or distributed training and evaluation of deep learning models.
  • Rapids - Gives the ability to execute end-to-end data science and analytics pipelines entirely on GPUs.
  • Ray - Fast and simple framework for building and running distributed applications.
  • Singa - Apache top level project, focusing on distributed training of DL and ML models.
  • TPOT - Automated ML tool that optimizes machine learning pipelines using genetic programming.

Simplification Tools

Tools related to machine learning simplification and standardization.

  • Hermione - Helps data scientists set up more organized code in a quicker and simpler way.
  • Hydra - A framework for elegantly configuring complex applications.
  • Koalas - Pandas API on Apache Spark. Makes data scientists more productive when interacting with big data.
  • Ludwig - Allows users to train and test deep learning models without the need to write code.
  • MLNotify - No need to keep checking your training, just one import line and you'll know the second it's done.
  • PyCaret - Open source, low-code machine learning library in Python.
  • Sagify - A CLI utility to train and deploy ML/DL models on AWS SageMaker.
  • Soopervisor - Export ML projects to Kubernetes (Argo workflows), Airflow, AWS Batch, and SLURM.
  • Soorgeon - Convert monolithic Jupyter notebooks into maintainable pipelines.
  • TrainGenerator - A web app to generate template code for machine learning.
  • Turi Create - Simplifies the development of custom machine learning models.

Visual Analysis and Debugging

Tools for performing visual analysis and debugging of ML/DL models.

  • Aporia - Observability with customized monitoring and explainability for ML models.
  • Arize - An end-to-end ML observability and model monitoring platform.
  • Evidently - Interactive reports to analyze ML models during validation or production monitoring.
  • Fiddler - Monitor, explain, and analyze your AI in production.
  • Manifold - A model-agnostic visual debugging tool for machine learning.
  • NannyML - Algorithm capable of fully capturing the impact of data drift on performance.
  • Netron - Visualizer for neural network, deep learning, and machine learning models.
  • Superwise - Fully automated, enterprise-grade model observability in a self-service SaaS platform.
  • Whylogs - The open source standard for data logging. Enables ML monitoring and observability.
  • Yellowbrick - Visual analysis and diagnostic tools to facilitate machine learning model selection.

Workflow Tools

Tools and frameworks to create workflows or pipelines in the machine learning context.

  • Argo - Open source container-native workflow engine for orchestrating parallel jobs on Kubernetes.
  • Automate Studio - Rapidly build & deploy AI-powered workflows.
  • Couler - Unified interface for constructing and managing workflows on different workflow engines.
  • dstack - An open-core tool to automate data and training workflows.
  • Flyte - Makes it easy to create concurrent, scalable, and maintainable workflows for machine learning.
  • Kale - Aims at simplifying the Data Science experience of deploying Kubeflow Pipelines workflows.
  • Kedro - Library that implements software engineering best-practice for data and ML pipelines.
  • Luigi - Python module that helps you build complex pipelines of batch jobs.
  • Metaflow - Human-friendly library that helps scientists and engineers build and manage data science projects.
  • MLRun - Generic mechanism for data scientists to build, run, and monitor ML tasks and pipelines.
  • Orchest - Visual pipeline editor and workflow orchestrator with an easy-to-use UI, based on Kubernetes.
  • Ploomber - Write maintainable, production-ready pipelines. Develop locally, deploy to the cloud.
  • Prefect - A workflow management system, designed for modern infrastructure.
  • ZenML - An extensible open-source MLOps framework to create reproducible pipelines.


Resources

Where to discover new tools and discuss existing ones.




Other Lists





Contributing

All contributions are welcome! Please take a look at the contribution guidelines first.