tutorial-multi-gpu
Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial
ISC24 Tutorial: Efficient Distributed GPU Programming for Exascale
Repository with talks and exercises of our Efficient Distributed GPU Programming for Exascale tutorial, to be held at ISC24.
Coordinates
- Date: 12 May 2024
- Occasion: ISC24 Tutorial
- Tutors: Simon Garcia de Gonzalo (SNL), Andreas Herten (JSC), Markus Hrywniak (NVIDIA), Jiri Kraus (NVIDIA), Lena Oden (Uni Hagen)
Setup
The tutorial is interactive, combining introductory lectures with practical exercises that apply the material. The exercises have been derived from the Jacobi solver implementations available in NVIDIA/multi-gpu-programming-models.
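For readers unfamiliar with the Jacobi method, the following is a minimal sketch of the idea behind the exercises, written in plain Python purely for illustration. It is not code from the tutorial or from NVIDIA/multi-gpu-programming-models. The grid setup, the function names (`make_grid`, `jacobi_sweep`, `solve_single`, `solve_split`) and the fixed boundary values are assumptions made for this example. It shows a single-domain Jacobi iteration and the same iteration split across two row blocks that swap one halo row after each sweep; the list copies stand in for the communication that the curriculum covers with MPI, NCCL and NVSHMEM.

```python
def make_grid(ny, nx):
    """Create an ny x nx grid of zeros with the left and right columns fixed at 1.0.

    The boundary values are a hypothetical choice for this sketch.
    """
    grid = [[0.0] * nx for _ in range(ny)]
    for row in grid:
        row[0] = 1.0
        row[-1] = 1.0
    return grid


def jacobi_sweep(a, row_start, row_end):
    """Return updated copies of rows [row_start, row_end) using the 4-neighbour average."""
    nx = len(a[0])
    new_rows = []
    for iy in range(row_start, row_end):
        new_row = list(a[iy])  # first and last column stay fixed
        for ix in range(1, nx - 1):
            new_row[ix] = 0.25 * (a[iy][ix - 1] + a[iy][ix + 1]
                                  + a[iy - 1][ix] + a[iy + 1][ix])
        new_rows.append(new_row)
    return new_rows


def solve_single(ny, nx, iters):
    """Run `iters` Jacobi sweeps on one undivided grid."""
    a = make_grid(ny, nx)
    for _ in range(iters):
        interior = jacobi_sweep(a, 1, ny - 1)
        a = [a[0]] + interior + [a[-1]]  # top and bottom rows stay fixed
    return a


def solve_split(ny, nx, iters):
    """Run the same sweeps on two row blocks that exchange halo rows."""
    full = make_grid(ny, nx)
    mid = ny // 2
    # Block "top" holds global rows 0..mid; its last row is a halo copy of row mid.
    top = [list(r) for r in full[:mid + 1]]
    # Block "bottom" holds global rows mid-1..ny-1; its first row is a halo copy of row mid-1.
    bottom = [list(r) for r in full[mid - 1:]]
    for _ in range(iters):
        new_top = jacobi_sweep(top, 1, len(top) - 1)
        new_bottom = jacobi_sweep(bottom, 1, len(bottom) - 1)
        top = [top[0]] + new_top + [top[-1]]
        bottom = [bottom[0]] + new_bottom + [bottom[-1]]
        # Halo exchange: each block receives its neighbour's freshly updated edge row.
        top[-1] = list(bottom[1])
        bottom[0] = list(top[-2])
    # Drop the halo rows and stitch the blocks back together.
    return top[:-1] + bottom[1:]
```

Calling `solve_split(8, 6, 20)` and `solve_single(8, 6, 20)` should give identical grids, since the halo exchange supplies each block with exactly the neighbour rows the undivided sweep would have read.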
Walk-through:
- Sign up at JuDoor: https://go.fzj.de/mg-jd
- Open Jupyter JSC: https://jupyter-jsc.fz-juelich.de
- Create a new Jupyter instance on JUWELS, using the training2414 account, on LoginNodeBooster
- Source course environment: `source $PROJECT_training2414/env.sh`
- Sync material: `jsc-material-sync`
- Locally install NVIDIA Nsight Systems: https://developer.nvidia.com/nsight-systems
Curriculum:
- Lecture: Tutorial Overview, Introduction to System + Onboarding (Andreas)
- Lecture: MPI-Distributed Computing with GPUs (Simon)
- Hands-on: Multi-GPU Parallelization
- Lecture: Performance / Debugging Tools (Markus)
- Lecture: Optimization Techniques for Multi-GPU Applications (Simon)
- Hands-on: Overlap Communication and Computation with MPI
- Lecture: Overview of NCCL and NVSHMEM in MPI (Lena)
- Hands-on: Using NCCL and NVSHMEM
- Lecture: Device-initiated Communication with NVSHMEM (Jiri)
- Hands-on: Using Device-Initiated Communication with NVSHMEM
- Lecture: Conclusion and Outline of Advanced Topics (Andreas)