ml-platform topic
skypilot
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 16+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
full-stack-on-prem-cv-mlops
"1 config, 1 command from Jupyter Notebook to serve millions of users": a full-stack on-premises MLOps system for computer vision, from data versioning to model monitoring and drift detection.
runbooks
Fine-tune LLMs on K8s using Runbooks
ai-ml-project-template
A "production-ready" simple project template to quickly start an Artificial Intelligence (AI), Machine Learning (ML) and/or Data Science (DS) project with basic files, branches and directory structure...
machine-learning-engineering
Welcome to the Machine Learning Engineering Repository, a comprehensive collection of resources, code, and insights to guide you through the exciting world of machine learning. This repository is des...
beta9
Run GPU Workloads Across Multiple Clouds
konduktor
Cluster/scheduler health monitoring for GPU jobs on K8s