
MS-HPC-AI-GPU

Resources for the introductory GPU programming course of the HPC-AI specialized master's program (mastère spécialisé HPC-AI)

biblio

Scientific Programming and Computer Architecture

Scientific Programming and Computer Architecture book by Divakar Viswanath
https://github.com/divakarvi/bk-spca : source code associated with the book
Definition of latency oriented architecture
Seven Dwarfs of HPC
CSCI-5576/4576: High Performance Scientific Computing
Mark Horowitz talk at ISSCC_2014: Computing's energy problem
Introduction to High-Performance Scientific Computing, book and slides by Victor Eijkhout and The Art of HPC website
San Diego Summer institute
Finnish CSC summer school
Computational Physics book by K. N. Anagnostopoulos
Modern computer architecture slides, see e.g. slides Intro_Architecture.pdf
Structure and Interpretation of Computer Programs

CUDA / GPU training

NVIDIA's latest CUDA programming guide
Julich training on CUDA
Oxford training on CUDA
Swiss CSCS summer school
Amina Guermouche (Telecom Paris)
EPCC, Univ Edinburgh, GPU training
ARCHER GPU course
Univ Luxembourg HPC
SC19 Introduction to GPU programming with CUDA
https://codingbyexample.com/category/cuda/
http://turing.une.edu.au/~cosc330/lectures/display_notes.php?lecture=18
https://www.nersc.gov/users/training/gpus-for-science/
https://dl.acm.org/citation.cfm?id=3318192
[email protected]:hwuligans/gputeachingkit-labs.git
http://syllabus.gputeachingkit.com/
udemy/cuda-programming-masterclass
SDL2 Graphics User Interface : https://github.com/rogerallen/smandelbrotr
mgbench : a multi-GPU benchmark
performance analysis : parallelforall blog on Nsight
misc : convert CUDA to portable C++ for AMD GPU
List of Nvidia GPUs
https://github.com/ashokyannam/GPU_Acceleration_Using_CUDA_C_CPP
https://github.com/karlrupp/cpu-gpu-mic-comparison
https://perso.centrale-marseille.fr/~gchiavassa/visible/HPC/01%20-%20GR%20%20Intro%20to%20GPU%20programming%20V2%20OpenACC%20.pdf

CUDA / performance analysis

https://devblogs.nvidia.com/using-nsight-compute-to-inspect-your-kernels/
https://www.olcf.ornl.gov/wp-content/uploads/2019/08/NVIDIA-Profilers.pdf
http://on-demand.gputechconf.com/gtc/2017/presentation/s7445-jakob-progsch-what-the-profiler-is-telling-you.pdf
monitoring performance : https://github.com/NERSC/timemory
roofline model

Other CUDA resources

C++ wrapper library
template CMake project for CUDA
Multi-GPU programming from FZJ
Multi-GPU programming from Nvidia
CUDA Library samples (cuFFT, cuSolver , cuSparse, ...)
MatX, a GPU-Accelerated Numerical Computing C++ library

CUDA / python

(NEW 2021) legate and cuNumeric
cuNumeric: drop-in replacement for NumPy, built on top of Legion
stdpar + cython
Numba // recommended numba tutorial for GPU programming
CuPy
pycuda
python / C++ CUDA interface (SWIG and Cython)
python / C++ CUDA interface with pybind11
PythonHPC
HPC Python videos
Hands-On GPU Programming with Python and CUDA and examples
2020-geilo-gpu-python
Numba introduction

Machine learning and Deep Learning

https://towardsdatascience.com/fast-data-augmentation-in-pytorch-using-nvidia-dali-68f5432e1f5f
https://ep2019.europython.eu/media/conference/slides/fX8dJsD-distributed-multi-gpu-computing-with-dask-cupy-and-rapids.pdf
https://github.com/NVIDIA/DeepLearningExamples
https://github.com/chagaz/hpc-ai-ml-2019
tensorflow tutorial
AI cheatsheet
m2dsupsdlclass
deep-learning-with-python-notebooks
https://d2l.ai/
Building a neural network FROM SCRATCH (no Tensorflow/Pytorch, just numpy & math)

Physics Informed Neural Networks (PINN)

Artificial Neural Networks for Solving Ordinary and Partial Differential Equations, Lagaris et al., IEEE Transactions on Neural Networks, Vol. 9, No. 5, September 1998
Physics Informed Deep Learning (Part I): Data-driven Solutions of Nonlinear Partial Differential Equations, https://arxiv.org/pdf/1711.10561.pdf
Raissi et al, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, https://doi.org/10.1016/j.jcp.2018.10.045
https://github.com/openhackathons-org/gpubootcamp/tree/master/hpc_ai/PINN
Nvidia Modulus documentation
Nvidia Modulus source code
Nvidia Modulus examples
DeepXDE
TensorDiffEq
SciANN, SciANN examples
neurodiffeq
Julia's DiffEqFlux.jl, NeuralOperators.jl and OperatorLearning
https://github.com/maziarraissi/PINNs
Fourier Neural Operator
a review article : Scientific Machine Learning Through Physics–Informed Neural Networks: Where we are and What’s Next
slides by Lu Lu (Univ. Penn)

Graphics / GPU

https://raytracing.github.io/
https://github.com/RayTracing/raytracing.github.io
https://github.com/rogerallen/raytracinginoneweekendincuda : very clean code, excellent

OpenMP

https://www.openmp.org/wp-content/uploads/openmp-examples-4.5.0.pdf
https://github.com/OpenMP/Examples/tree/v4.5.0/sources
https://ukopenmpusers.co.uk/wp-content/uploads/uk-openmp-users-2018-OpenMP45Tutorial_new.pdf
https://www.nas.nasa.gov/hecc/assets/pdf/training/OpenMP4.5_3-20-19.pdf
http://www.admin-magazine.com/HPC/Articles/OpenMP-Coding-Habits-and-GPUs?utm_source=AMEP

OpenMP target

How to build clang yourself with OpenMP target support for Nvidia GPUs
- https://hpc-wiki.info/hpc/Building_LLVM/Clang_with_OpenMP_Offloading_to_NVIDIA_GPUs
- https://devmesh.intel.com/blog/724749/how-to-build-and-run-your-modern-parallel-code-in-c-17-and-openmp-4-5-library-on-nvidia-gpus
https://www.openmp.org/wp-content/uploads/SC17-OpenMPBooth_jlarkin.pdf
OpenMP 5.0 for accelerators at GTC 2019
LLVM/Clang based compiler for both AMD/NVidia GPUs
OpenMP target examples
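To give an idea of what the OpenMP target resources above deal with, here is a minimal sketch of an offloaded loop. It is not taken from any of the linked materials; the function name `saxpy_target` is a hypothetical choice for illustration. It assumes a compiler with OpenMP offloading enabled (for example clang with `-fopenmp -fopenmp-targets=nvptx64-nvidia-cuda`). Without offloading support, the loop should simply run on the host, and without `-fopenmp` the pragma is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the linked resources): y[i] = a * x[i] + y[i].
void saxpy_target(float a, const std::vector<float>& x, std::vector<float>& y)
{
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    const float* xp = x.data();
    float* yp = y.data();

    // map(to: ...)     copies x to the device before the loop,
    // map(tofrom: ...) copies y to the device and back to the host afterwards.
    #pragma omp target teams distribute parallel for \
        map(to: xp[0:n]) map(tofrom: yp[0:n])
    for (std::size_t i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}
```

The `map` clauses are the part that differs most from classic CPU OpenMP: they describe the host/device data transfers explicitly, which is usually where most of the tuning effort goes.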

How to build clang++ with OpenMP target (offloading) support?

https://devmesh.intel.com/blog/724749/how-to-build-and-run-your-modern-parallel-code-in-c-17-and-openmp-4-5-library-on-nvidia-gpus
https://hpc-wiki.info/hpc/Building_LLVM/Clang_with_OpenMP_Offloading_to_NVIDIA_GPUs

OpenACC

OpenACC Programming and Best Practices Guide
PGI compiler - OpenACC getting started guide
https://www.fz-juelich.de/SharedDocs/Downloads/IAS/JSC/EN/slides/openacc/2-openacc-introduction.pdf?__blob=publicationFile
Introduction to GPU programming using OpenACC
https://github.com/eth-cscs/SummerSchool2019/tree/master/topics/openacc
https://developer.nvidia.com/openacc-overview-course
https://perso.centrale-marseille.fr/~gchiavassa/visible/HPC/01%20-%20GR%20%20Intro%20to%20GPU%20programming%20V2%20OpenACC%20.pdf
Jeff Larkin (Nvidia) Introduction to OpenACC
Jeff Larkin (Nvidia) OpenACC data management
Jeff Larkin (Nvidia) OpenACC optimizations
OpenACC training material as notebooks
https://www.pgroup.com/resources/docs/19.10/pdf/pgi19proftut.pdf
https://github.com/OpenACCUserGroup/openacc_concept_strategies_book
https://developer.nvidia.com/blog/solar-storm-modeling-gpu-openacc/

Which compiler with OpenACC support?

Nvidia/PGI compiler is the oldest and probably the most mature OpenACC compiler.
GNU gcc: installing it through Spack is the easiest way to get started with OpenMP/OpenACC offload using the GNU compiler.
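
As a point of comparison between these compilers, the following minimal sketch shows the kind of directive they have to handle. It is an illustration only, not code from the linked guides; the function name `dot_acc` is hypothetical. It assumes compilation with OpenACC enabled (for example `nvc++ -acc` or `g++ -fopenacc`). Without such a flag the pragma is ignored and the loop runs serially on the CPU.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the linked guides): dot product of x and y.
double dot_acc(const std::vector<double>& x, const std::vector<double>& y)
{
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    const double* xp = x.data();
    const double* yp = y.data();
    double sum = 0.0;

    // copyin(...) transfers the inputs to the device;
    // reduction(+:sum) combines the per-thread partial sums into sum.
    #pragma acc parallel loop reduction(+:sum) copyin(xp[0:n], yp[0:n])
    for (std::size_t i = 0; i < n; ++i) {
        sum += xp[i] * yp[i];
    }
    return sum;
}
```

Building the same file with both compilers and comparing their diagnostic output is a simple way to see how each one maps the loop onto the GPU.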

C++17 and parallel STL for CPU/GPU

accelerating-standard-c-with-gpus-using-stdpar/ for Nvidia GPUs
a real life example in CFD: LULESH
another CFD reference: stdpar for Lattice Boltzmann simulation, and its companion code
https://github.com/shwina/stdpar-cython/
https://software.intel.com/content/www/us/en/develop/articles/get-started-with-parallel-stl.html

Which compiler ?

Nvidia/PGI compiler for Nvidia GPUs
GNU g++ version >= 9.1 (+ TBB) for multicore CPUs
clang >= 10.0.1 for multicore CPUs
Intel oneAPI HPC Toolkit

stdpar for Fortran

https://developer.nvidia.com/blog/accelerating-fortran-do-concurrent-with-gpus-and-the-nvidia-hpc-sdk/
example code euler2d_cudaFortran : solving Euler's equations in Fortran with stdpar (do concurrent loops)

SYCL

Khronos
syclacademy
oneAPI-samples
more oneAPI / SYCL samples
a short tutorial
Compilers / toolchain
- codeplay
- Intel oneAPI. If you want Nvidia GPU support, you'll have to rebuild llvm/clang from the source code, see instructions; oneAPI DPC++ is actually a SYCL implementation plus extensions (Unified Shared Memory, Explicit SIMD, ...)
- triSYCL for Xilinx FPGA target
Comparison Kokkos/SYCL (early 2020)

Books on GPU programming / recommended reading

The CUDA Handbook: A Comprehensive Guide to GPU Programming, by Nicholas Wilt, Pearson Education.
CUDA by example, by Sanders and Kandrot, Addison-Wesley, 2010. Also available in pdf
Learn CUDA programming by B. Sharma and J. Han, Packt Publishing, 2019
Python + CUDA : https://github.com/PacktPublishing/Hands-On-GPU-Programming-with-Python-and-CUDA
https://www.oreilly.com/library/view/hands-on-gpu-programming/9781788993913/ by Brian Tuomanen

C++ resources

Discovering Modern C++: An Intensive Course for Scientists, Engineers, and Programmers, and companion github website
https://github.com/changkun/modern-cpp-tutorial
https://github.com/eth-cscs/examples_cpp
https://github.com/mandliya/algorithms_and_data_structures
https://www.fz-juelich.de/SharedDocs/Downloads/IAS/JSC/EN/slides/cplusplus/cplusplus.pdf?__blob=publicationFile
https://gitlab.maisondelasimulation.fr/tpadiole/hpcpp
http://www.cppstdlib.com/
http://101.lv/learn/C++/
https://github.com/caveofprogramming/advanced-cplusplus
https://en.cppreference.com/w/
list of lists of C++-related resources: https://github.com/fffaraz/awesome-cpp
list of books on C++ : https://github.com/fffaraz/awesome-cpp/blob/master/books.md
C++ idioms
Design Patterns and Book on design patterns for modern c++
Julich training on C++
CSCS computing center training on C++ videos
CppCon and videos on YouTube
Bo Qian YouTube channel on C++11
https://github.com/TheAlgorithms/C-Plus-Plus
C++ course from the University of Strasbourg

high-level C++ libraries for programming GPUs

Alternate programming models for programming modern computing architectures in a performance portable way:

introduction to performance portability
https://github.com/arrayfire/arrayfire
https://docs.nvidia.com/cuda/thrust/index.html
https://github.com/kokkos/kokkos
https://github.com/LLNL/RAJA and https://github.com/LLNL/RAJA-tutorials
https://github.com/triSYCL/triSYCL
https://github.com/codeplaysoftware/computecpp-sdk

Performance portability

Performance portability

Kokkos/C++ library

https://github.com/kokkos/kokkos
https://github.com/kokkos/kokkos-tutorials
https://github.com/kokkos/kokkos-tutorials/wiki/Kokkos-Lecture-Series
C++ Performance Portability - A Decade of Lessons Learned - Christian Trott - CppCon 2022

CMake

Git

Git cheatsheet

Misc

Udacity CS344 video archive
cuda related : https://gist.github.com/allanmac/f91b67c112bcba98649d - cuda_assert
FPGA, loop transformation, matrix multiplication
Hype cycle
https://press3.mcs.anl.gov/atpesc/files/2019/08/ATPESC_2019_Dinner_Talk_8_8-7_Foster-Coding_the_Continuum.pdf

Shell and command line skills

Learn or improve your Linux command-line/Bash skills, e.g. http://swcarpentry.github.io/shell-novice/
http://www.tldp.org/LDP/abs/html/
http://www.epons.org/commandes-base-linux.php
The art of command line

Blogs or newsletters on HPC

https://www.nextplatform.com/
Subscribe to blogs/newsletters on HPC, e.g. Admin-magazine / HPC
Intel Parallel Universe Magazine (in English)

MOOC

Amazon
udemy

Project

Porting a C++ code that simulates the Navier-Stokes equations using the lattice Boltzmann method.
