tensorflow-k8s-azure
Train TensorFlow Models at Scale with Kubernetes and Kubeflow on Azure
:warning: This repository is deprecated! Go to Azure/kubeflow-labs instead :warning:
Train TensorFlow Models at Scale with Kubernetes on Azure
Prerequisites
- Have a valid Microsoft Azure subscription allowing the creation of an ACS cluster
- Docker client installed: Installing Docker
- Azure CLI (2.0) installed: Installing the Azure CLI 2.0 | Microsoft Docs
- Git CLI installed: Installing Git CLI
- kubectl installed: Installing Kubectl
- Helm installed: Installing Helm CLI (Note: On Windows you can extract the
tarfile using a tool like 7Zip.)
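A quick way to confirm the command-line prerequisites above is to check that each tool is on your `PATH`. The snippet below is only a minimal sketch: the `check_tools` helper is hypothetical (it is not part of this repository), and it assumes the tools are installed under their default executable names (`docker`, `az`, `git`, `kubectl`, `helm`). It does not verify versions or your Azure subscription.

```shell
#!/bin/sh
# Hypothetical helper: report which of the given tools are missing from PATH.
# Prints one line per tool and returns non-zero if any tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Executable names are assumptions based on the default installs of each tool.
check_tools docker az git kubectl helm || echo "Install the missing tools before continuing."
```

On Windows, this would need a POSIX-compatible shell such as Git Bash.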
Clone this repository somewhere so you can easily access the different source files:
git clone https://github.com/wbuchwalter/tensorflow-k8s-azure
Content Summary
| # | Module | Description |
|---|---|---|
| 0 | Introduction | Introduction to this workshop. Motivations and goals. |
| 1 | Docker | Docker and containers 101. |
| 2 | Kubernetes | Overview of important Kubernetes concepts. |
| 3 | Helm | Introduction to Helm. |
| 4 | GPUs | How to use GPUs with Kubernetes. |
| 5 | TFJob | How to use tensorflow/k8s and TFJob to deploy a simple TensorFlow training job. |
| 6 | Distributed TensorFlow | Going distributed with TFJob. |
| 7 | Hyperparameters Sweep with Helm | Using Helm to deploy a large number of training jobs that test different hypotheses, then monitoring and comparing them. |
| 8 | Going Further | Links and resources to go further: Autoscaling, Distributed Storage. |
| 9 | Jupyter Notebooks | Easily deploy a Jupyter Notebook instance on Kubernetes. |