Efficient Deep Learning Systems course materials (HSE, YSDA)

Efficient Deep Learning Systems

This repository contains materials for the Efficient Deep Learning Systems course taught at the Faculty of Computer Science of HSE University and Yandex School of Data Analysis.

Syllabus

  • Week 1: Introduction
    • Lecture: Course overview and organizational details. Core concepts of the GPU architecture and CUDA API.
    • Seminar: CUDA operations in PyTorch. Introduction to benchmarking.
  • Week 2: Basics of distributed ML
    • Lecture: Introduction to distributed training. Process-based communication. Parameter Server architecture.
    • Seminar: Multiprocessing basics. Parallel GloVe training.
  • Week 3: Data-parallel training and All-Reduce
    • Lecture: Data-parallel training of neural networks. All-Reduce and its efficient implementations.
    • Seminar: Introduction to PyTorch Distributed. Data-parallel training primitives.
  • Week 4: Memory-efficient and model-parallel training
    • Lecture: Model-parallel training, gradient checkpointing, offloading.
    • Seminar: Gradient checkpointing in practice.
  • Week 5: Training optimizations, profiling DL code
    • Lecture: Mixed-precision training. Data storage and loading optimizations. Tools for profiling deep learning workloads.
    • Seminar: Automatic Mixed Precision in PyTorch. Dynamic padding for sequence data and JPEG decoding benchmarks. Basics of PyTorch Profiler and cProfile.
  • Week 6: Python web application deployment
    • Lecture/Seminar: Building and deploying production-ready web services. App & web servers, Docker containers, Prometheus metrics, API via HTTP and gRPC.
  • Week 7: Software for serving neural networks
    • Lecture/Seminar: Formats for packaging neural networks: ONNX, TorchScript, IR. Inference servers: OpenVINO, Triton. ML on client devices: TfJS, ML Kit, Core ML.
  • Week 8: Optimizing models for faster inference
    • Lecture: Knowledge distillation, pruning, quantization, NAS, efficient architectures.
    • Seminar: Quantization and distillation of Transformers.
  • Week 9: Experiment tracking, model and data versioning, testing DL code in Python
    • Lecture: Experiment management basics and pipeline versioning. Configuring Python applications. Intro to regular and property-based testing.
    • Seminar: Example DVC+W&B project walkthrough. Intro to testing with pytest.
  • Week 10: Invited talks
    • Memory Footprint Reduction Techniques for DNN Training: An Overview. Gennady Pekhimenko, University of Toronto, Vector Institute
    • Efficient Inference of Deep Learning Models on (GP)GPU. Ivan Komarov, Yandex

Grading

There will be three home assignments in total, some of them spread over several weeks. The final grade is a weighted sum of per-assignment grades. Please refer to the course page of your institution for details.
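As a rough illustration of the weighted-sum rule, here is a minimal sketch. The function name `final_grade`, the 10-point scale, and the weights are all hypothetical placeholders chosen for the example; the actual weights and scale are defined on your institution's course page.

```python
def final_grade(grades, weights):
    """Return the weighted sum of per-assignment grades."""
    if len(grades) != len(weights):
        raise ValueError("grades and weights must have the same length")
    return sum(g * w for g, w in zip(grades, weights))


# Hypothetical values: three assignments on an assumed 10-point scale,
# with made-up weights that sum to 1. These are NOT the official course weights.
example_grades = [8.0, 6.0, 10.0]
example_weights = [0.25, 0.25, 0.5]
example_result = final_grade(example_grades, example_weights)
print(example_result)
```

With these made-up numbers, the third assignment contributes half of the final grade, so a strong result there outweighs a weaker second assignment.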

Staff