AI-Engineer-Note
Things I have collected on the road to becoming a Senior AI Engineer
A collection of notes for AI Engineers & service deployment
-
Deeplearning
- 1. ComputerVision
- 1.1 Common Architectures
- 1.1.1 ResBlock
- 1.1.2 Gated Convolution
- 1.1.3 Multi-head Attention
- 2. NLP
-
Frameworks
- 1. TensorRT
- 1.1 Convert ONNX model to TensorRT
- 1.2 Wrapped TensorRT-CPP Models
- 2. Pytorch
- 2.1 Build Pytorch from source (Optimize speed for AMD CPU & NVIDIA GPU)
-
Deploy
- 1. NVIDIA
- 1.1 Multi-instance GPU (MIG)
- 1.2 FFMPEG with Nvidia hardware-acceleration
- 2. Deepstream
- 2.1 Yolov4
- 2.2 Traffic Analyst
- 2.3 SCRFD Face Detection (custom parser & NMS plugin with landmark)
- 3. Triton Inference Server
- 3.1 Installing triton-server and triton-client
- 3.1.1 Model management modes (load/unload/reload)
- 3.2 Overview of the backends in Triton
- 3.3 Basic configuration when deploying a model
- 3.4 Deploying a model
- 3.4.1 ONNX-runtime
- 3.4.2 TensorRT
- 3.4.3 Pytorch & TorchScript
- 3.4.4 Kaldi (Advanced)
- 3.5 Model Batching
- 3.6 Ensemble Models and pre/post processing
- 3.7 Sử dụng Performance Analyzer Tool
- 3.8 Optimizations
- 3.8.1 Optimizing the Pytorch backend
- 4. TAO Toolkit (Transfer-Learning-Toolkit)
-
Linux & CUDA & APT-Packages
-
Build OpenCV from source
-
Install Math Kernel Library (MKL/BLAS/LAPACK/OPENBLAS)
It is recommended to install all of the math kernel libraries first and then compile your framework (e.g. PyTorch, MXNet) from source with a custom config for the best performance; a build sketch is shown after the install commands below.

Install all LAPACK + BLAS packages:

sudo apt install libjpeg-dev libpng-dev libblas-dev libopenblas-dev libatlas-base-dev liblapack-dev liblapacke-dev gfortran

Install MKL:

# Get the key
wget https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB
# Now install that key
apt-key add GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB
# Now remove the public key file and exit the root shell
rm GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB
# Add the Intel repositories to apt
sudo wget https://apt.repos.intel.com/setup/intelproducts.list -O /etc/apt/sources.list.d/intelproducts.list
sudo sh -c 'echo deb https://apt.repos.intel.com/mkl all main > /etc/apt/sources.list.d/intel-mkl.list'
# Install
sudo apt-get update
sudo apt-get install intel-mkl-2020.4-912
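With MKL (or OpenBLAS) installed, the BLAS backend can be selected when compiling PyTorch from source. This is only a minimal sketch, assuming the PyTorch source tree is already cloned; BLAS and USE_MKLDNN are standard PyTorch build variables, but check the setup.py of your checkout for the options it actually supports.

# Minimal sketch: build PyTorch against MKL (assumes the repo is already cloned)
cd pytorch
pip install -r requirements.txt
BLAS=MKL USE_MKLDNN=1 python setup.py install
# or, to link against OpenBLAS instead of MKL:
# BLAS=OpenBLAS python setup.py install
-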
Fresh install NVIDIA driver (PC/Laptop/Workstation)
# Remove old packages
sudo apt-get remove --purge '^nvidia-.*'
sudo apt-get install ubuntu-desktop
sudo apt-get --purge remove "*cublas*" "cuda*"
sudo apt-get --purge remove "*nvidia*"
sudo add-apt-repository --remove ppa:graphics-drivers/ppa
sudo rm /etc/X11/xorg.conf
sudo apt autoremove
sudo reboot

# After restart
sudo ubuntu-drivers devices
sudo ubuntu-drivers autoinstall
sudo reboot
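After the final reboot, a quick sanity check that the driver loaded correctly:

# Should list the driver version and all attached GPUs without errors
nvidia-smi
-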
Install CuDNN
Install the keyring first: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/#network-repo-installation-for-ubuntu

Then install cuDNN 9 for CUDA 11:

sudo apt-get update
sudo apt-get -y install cudnn9-cuda-11
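A quick check that cuDNN is installed and can be loaded (sketch; the second line assumes a CUDA build of PyTorch is installed):

# List the installed cuDNN packages
dpkg -l | grep cudnn
# Print the cuDNN version that PyTorch managed to load
python -c "import torch; print(torch.backends.cudnn.version())"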
-
NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver
First, make sure that you have done a "Fresh install NVIDIA driver" as above. If that does not work, try the steps below.
- Make sure the package nvidia-prime is installed and selected:
sudo apt install nvidia-prime
sudo prime-select nvidia
- Make sure that NVIDIA is not blacklisted. Run
grep nvidia /etc/modprobe.d/* /lib/modprobe.d/*
to find a file containing "blacklist nvidia", remove it, then run
sudo update-initramfs -u
- Get the boot log:
journalctl -b | grep NVIDIA
- If you get the error "This PCI I/O region assigned to your NVIDIA device is invalid", edit the GRUB config:
sudo nano /etc/default/grub
and set
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=realloc=off"
then run
sudo update-grub
sudo reboot
-
Check current CUDA version
nvcc --version
-
Check current supported CUDA versions
ls /usr/local/
-
Select GPU devices
CUDA_VISIBLE_DEVICES=<index-of-devices> <command>
CUDA_VISIBLE_DEVICES=0 python abc.py
CUDA_VISIBLE_DEVICES=0 ./sample.sh
CUDA_VISIBLE_DEVICES=0,1,2,3 python abc.py
CUDA_VISIBLE_DEVICES=0,1,2,3 ./sample.sh
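The variable can also be exported once for a whole shell session; note that the visible devices are re-indexed from 0 inside the process (train.py below is just a placeholder script name):

# Export once; every following command in this shell sees only GPU 0 and 1
export CUDA_VISIBLE_DEVICES=0,1
python train.py   # inside the process they appear as cuda:0 and cuda:1
-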
Switch CUDA version
CUDA_VER=11.3
export PATH="/usr/local/cuda-$CUDA_VER/bin:$PATH"
export LD_LIBRARY_PATH=/usr/local/cuda-$CUDA_VER/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
-
Check NVENV/NVDEC status
nvidia-smi dmon
See the %enc and %dec columns.
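To actually see those columns move, run a hardware-accelerated transcode in another terminal (sketch; assumes an FFMPEG build with NVENC/NVDEC support and an input file named input.mp4):

# Decode on NVDEC and encode on NVENC; %dec and %enc should rise while this runs
ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input.mp4 -c:v h264_nvenc output.mp4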
-
Error with distributed training NCCL (training freezes)
export NCCL_P2P_DISABLE="1"
-
Broken pipe (Distributed training with NCCL)
Run training with the args
NCCL_DEBUG=INFO TORCH_CPP_LOG_LEVEL=INFO TORCH_DISTRIBUTED_DEBUG=INFO torchrun ...
to find the socket interface name (e.g. eno1) in the log:
NCCL INFO NET/IB : No device found.
rnd3:77634:79720 [0] NCCL INFO NET/Socket : Using [0]eno1:10.9.3.241<0>
rnd3:77634:79720 [0] NCCL INFO Using network Socket
On the other nodes, run with the arg
NCCL_SOCKET_IFNAME=eno1
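A minimal two-node sketch of how that variable is used with torchrun (the address, port, GPU count and train.py are placeholders):

# On the master node (node rank 0)
NCCL_SOCKET_IFNAME=eno1 torchrun --nnodes=2 --node_rank=0 --nproc_per_node=4 --master_addr=10.9.3.241 --master_port=29500 train.py
# On the second node (node rank 1)
NCCL_SOCKET_IFNAME=eno1 torchrun --nnodes=2 --node_rank=1 --nproc_per_node=4 --master_addr=10.9.3.241 --master_port=29500 train.py
-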
Install CMake from source
version=3.23
build=2
## don't modify from here
mkdir ~/temp
cd ~/temp
wget https://cmake.org/files/v$version/cmake-$version.$build.tar.gz
tar -xzvf cmake-$version.$build.tar.gz
cd cmake-$version.$build/
./bootstrap
make -j8
sudo make install
-
Install NCCL Backend (Distributed training)
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo apt-get update
sudo apt install libnccl2 libnccl-dev
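To confirm that the backend is usable from PyTorch (sketch, assuming a CUDA build of PyTorch is installed):

# True means torch.distributed can use the NCCL backend
python -c "import torch; print(torch.distributed.is_nccl_available())"
# Print the NCCL version PyTorch was built against
python -c "import torch; print(torch.cuda.nccl.version())"
-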
Install MXNet from source
git clone --recursive --branch 1.9.1 https://github.com/apache/incubator-mxnet.git mxnet
cd mxnet
cp config/linux_gpu.cmake config.cmake
rm -rf build
mkdir -p build && cd build
cmake -DUSE_CUDA=ON -DUSE_CUDNN=OFF -DUSE_MKL_IF_AVAILABLE=OFF -DUSE_MKLDNN=OFF -DUSE_OPENMP=OFF -DUSE_OPENCV=ON -DUSE_BLAS=open ..
make -j32
cd ../python
pip install --user -e .
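A quick import test after the build (sketch):

# Should print the MXNet version and the number of GPUs it can see
python -c "import mxnet; print(mxnet.__version__, mxnet.context.num_gpus())"
-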
Tensorflow could not load dynamic library 'cudart64_101.dll'
In the example above, TensorFlow requires CUDA 10.1. Either switch to CUDA 10.1 (see "Switch CUDA version" above) or install a TensorFlow version that is compatible with your installed CUDA version; the compatibility table is here: https://www.tensorflow.org/install/source#gpu
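One way to see which CUDA/cuDNN versions an installed TensorFlow wheel was built against (sketch; the build-info API is available in TF 2.x):

python -c "import tensorflow as tf; print(tf.sysconfig.get_build_info())"
-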
Fix Deepstream (6.2+) FFMPEG OpenCV installation
Fixes errors about undefined references to, or missing, libavcodec, libavutil, libvpx, ...
apt-get install --reinstall --no-install-recommends -y libavcodec58 libavcodec-dev libavformat58 libavformat-dev libavutil56 libavutil-dev gstreamer1.0-libav
apt install --reinstall gstreamer1.0-plugins-good
apt install --reinstall libvpx6 libx264-155 libx265-179 libmpg123-0 libmpeg2-4 libmpeg2encpp-2.1-0
gst-inspect-1.0 | grep 264
rm ~/.cache/gstreamer-1.0/registry.x86_64.bin
apt install --reinstall libx264-155
apt-get install gstreamer1.0-libav
apt-get install --reinstall gstreamer1.0-plugins-ugly
-
Gstreamer pipeline to convert MP4-MP4 with re-encoding
gst-launch-1.0 filesrc location="<path-to-input>" ! qtdemux ! video/x-h264 ! h264parse ! avdec_h264 ! videoconvert ! x264enc ! h264parse ! qtmux ! filesink location=<path-to-output>
-
Gstreamer pipeline to convert RTSP-RTMP
gst-launch-1.0 rtspsrc location='rtsp://<path-to-rtsp-input>' ! rtph264depay ! h264parse ! flvmux ! rtmpsink location='rtmp://<path-to-rtmp-output>'
-
Gstreamer pipeline to convert RTSP-RTMP with reduced resolution
gst-launch-1.0 rtspsrc location='rtsp://<path-to-rtsp-input>' ! rtpbin ! rtph264depay ! h264parse ! avdec_h264 ! videoconvert ! videoscale ! video/x-raw,width=640,height=640 ! x264enc ! h264parse ! flvmux streamable=true ! rtmpsink location='rtmp://<path-to-rtmp-output>'
-