centernet_kinect
Real-time CenterNet-based object detection on fused IR/Depth images from a Kinect sensor. Works on NVIDIA Jetson.
This repository demonstrates how to set up an Azure Kinect camera with your Jetson platform, collect and annotate data, train an object detection model with the collected data, and finally run a real-time object detection model on your development kit.
Install Sensor SDK on Jetson
- Install python packages
- Get pretrained weights
Collect/Annotate Data
- Data collection
- Annotate data
- Train Model
Install Sensor SDK on Jetson
Note: in this tutorial we will be installing the SDK on Ubuntu 18.04.5 LTS.
To check your distribution/version you can run the following command:
cat /etc/os-release
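If you would rather check the distribution from a script, the same file can be read in Python. The snippet below is a minimal sketch: the helper name `parse_os_release` and the sample text are illustrative and are not part of this repository.

```python
def parse_os_release(text):
    """Parse the KEY=value lines of an os-release file into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in double or single quotes.
        info[key] = value.strip().strip('"').strip("'")
    return info


# Sample content for illustration; on a real system read /etc/os-release.
sample = 'NAME="Ubuntu"\nVERSION_ID="18.04"\nID=ubuntu\n'
info = parse_os_release(sample)
print(info["ID"], info["VERSION_ID"])
```

On the device itself you would pass in the real file contents, for example `parse_os_release(open("/etc/os-release").read())`.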
Here is the link to the Microsoft package repository in case you are using another distribution/version.
Here are more instructions on how to configure and install the SDK on other platforms: Link
1. Add Microsoft's product repository for ARM64
curl | sudo apt-key add -
sudo apt-add-repository
sudo apt-get update
2. Install Kinect Package
sudo apt install k4a-tools
sudo apt install libk4a1.4-dev
3. Setup udev rules
In order to use the Azure Kinect SDK with the device without being 'root', you will need to set up udev rules: Link
Copy 'scripts/99-k4a.rules' into '/etc/udev/rules.d/'.
Detach and reattach Azure Kinect devices if attached during this process.
- test the SDK
Note: Here are more instructions if you are experiencing difficulty with your setup: Link
Install python packages
- install python pip
sudo apt update
sudo apt-get install python3-pip
- install torch 1.7.0. Here is the Link to choose the correct versions.
wget -O torch-1.7.0-cp36-cp36m-linux_aarch64.whl
sudo apt-get install python3-pip libopenblas-base libopenmpi-dev
pip3 install Cython
pip3 install numpy torch-1.7.0-cp36-cp36m-linux_aarch64.whl
- install torch vision 0.8.1
sudo apt-get install libjpeg-dev zlib1g-dev libpython3-dev libavcodec-dev libavformat-dev libswscale-dev
git clone --branch v0.8.1 torchvision
cd torchvision
export BUILD_VERSION=0.8.1 # where 0.8.1 is the torchvision version
sudo python3 install
cd ../ # attempting to load torchvision from build dir will result in import error
pip3 install 'pillow<7' # always needed for Python 2.7, not needed torchvision v0.5.0+ with Python 3.6
- install requirements
# Make sure to install Protobuf compiler before running pip installation of onnx
sudo apt-get install protobuf-compiler libprotoc-dev
pip3 install -r requirments.txt
- install torch2trt for compiling the TensorRT model; here is a Link
Get pretrained weights
- get the pretrained weight from the following link and place the file in:
cp DOWNLOADED_WEIGHTS PATH/TO/PROJECT/checkpoint/Logistic_ResnetCenterNet_fused.pth
Collect/Annotate Data
We will discuss data collection and annotation in this section
1. Data collection
- Set JSON_ANNOTATION_PATH and SAVE_DATASET_PATH in pipeline/ file
- SAVE_DATASET_PATH sets the location where the dataset is saved. By default it is set to PATH/TO/PROJ/annotation_json/. The dataset directory hierarchy is laid out as follows:
data
├── annotation
├── casted_ir_image
├── depth_image
├── ir_image
└── rgb_image
- JSON_ANNOTATION_PATH sets the location where the annotation JSON files used to train the model (i.e. train.json, val.json) are stored. These files are created after parsing the annotation (.xml) files. By default it is set to PATH/TO/PROJ/annotation_json/. An example entry:
{"img_path": "/path/to/image.png", "chw": [1, 576, 640], "boxes": [[345, 191, 384, 235], [424, 185, 467, 223], [309, 401, 341, 430], [152, 430, 198, 483]], "labels": [1, 1, 1, 1]}
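Before training, it can help to sanity-check entries of this form. The following is a minimal sketch rather than code from this repository: the function name `validate_entry` is hypothetical, and it assumes that `chw` means (channels, height, width) and that each box is `[x_min, y_min, x_max, y_max]` in pixels, as in Pascal VOC.

```python
import json


def validate_entry(entry):
    """Return a list of problems found in one annotation entry (empty if none)."""
    problems = []
    # Assumption: chw is (channels, height, width).
    _, height, width = entry["chw"]
    if len(entry["boxes"]) != len(entry["labels"]):
        problems.append("boxes and labels differ in length")
    # Assumption: boxes are [x_min, y_min, x_max, y_max] in pixels.
    for i, (x_min, y_min, x_max, y_max) in enumerate(entry["boxes"]):
        if not (0 <= x_min < x_max <= width and 0 <= y_min < y_max <= height):
            problems.append("box %d is outside the image or degenerate" % i)
    return problems


line = ('{"img_path": "/path/to/image.png", "chw": [1, 576, 640], '
        '"boxes": [[345, 191, 384, 235], [424, 185, 467, 223]], "labels": [1, 1]}')
entry = json.loads(line)
print(validate_entry(entry))
```

An empty list means the entry passed both checks; any string in the list names the first thing worth inspecting by hand.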
2. Annotate data
- We used labelImg to annotate our dataset. You can install it by running the following command:
pip install labelImg
- Open the DATASETPATH/data/casted_ir_image directory to load images, set the Save Dir to DATASETPATH/data/annotation, and start annotating.
- When you have finished annotating the data, you need to parse the Pascal VOC files and create JSON files containing the annotation information. Run pipeline/ to create:
- label_map.json
- train.json
- val.json
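The conversion from a Pascal VOC file to a JSON entry can be pictured as follows. This is a minimal sketch, not the repository's parser: the function name `voc_to_entry`, the class name `person`, and the label map are hypothetical, and it assumes the .xml files follow the standard Pascal VOC layout that labelImg writes.

```python
import xml.etree.ElementTree as ET


def voc_to_entry(xml_text, label_map, img_path):
    """Convert one Pascal VOC annotation (as written by labelImg) to a dict."""
    root = ET.fromstring(xml_text.strip())
    size = root.find("size")
    chw = [int(size.findtext("depth")),
           int(size.findtext("height")),
           int(size.findtext("width"))]
    boxes, labels = [], []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        boxes.append([int(box.findtext(tag))
                      for tag in ("xmin", "ymin", "xmax", "ymax")])
        # Map the class name to its integer id.
        labels.append(label_map[obj.findtext("name")])
    return {"img_path": img_path, "chw": chw, "boxes": boxes, "labels": labels}


# Hypothetical annotation with a single object of a made-up class "person".
sample_xml = """
<annotation>
  <size><width>640</width><height>576</height><depth>1</depth></size>
  <object>
    <name>person</name>
    <bndbox><xmin>345</xmin><ymin>191</ymin><xmax>384</xmax><ymax>235</ymax></bndbox>
  </object>
</annotation>
"""
entry = voc_to_entry(sample_xml, {"person": 1}, "/path/to/image.png")
print(entry)
```

A label map of this shape (class name to integer id) is presumably what label_map.json stores, though its exact format is not shown here.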
Train Model
With the dataset ready, we can train the model (it is recommended to collect and annotate over 3k images for better performance).
1. setup training parameters
- CHECKPOINT_PATH: the path to save the model checkpoints (default PATH/TO/PROJ/checkpoint)
- you can train with either depth-only images or fused IR and depth images (set in pipeline/)
- depth input images: image size (3, 300, 300); the depth image is duplicated across all 3 input channels
- fused input images: channel 1 IR image, channel 2 depth image, channel 3 mean of the first 2 channels
- loss functions: you can use either Logistic or MSE loss for the heatmap regression (set in pipeline/)
- set the max epoch in pipeline/
- Model naming convention: LossFunc_ModelName_dataType.pth (e.g. Logistic_CenterNet_fused.pth or Logistic_CenterNet_depth.pth)
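The two input layouts described above can be sketched with NumPy. This is an illustration under stated assumptions, not the repository's preprocessing code: the function names are hypothetical, and the IR and depth images are assumed to be already resized to 300 x 300 and scaled to a comparable range.

```python
import numpy as np


def fuse_ir_depth(ir, depth):
    """Stack IR, depth and their mean into a (3, H, W) float32 array."""
    ir = ir.astype(np.float32)
    depth = depth.astype(np.float32)
    # Channel 1: IR, channel 2: depth, channel 3: mean of the first two.
    return np.stack([ir, depth, (ir + depth) / 2.0], axis=0)


def depth_to_three_channels(depth):
    """Duplicate a single depth image across three channels."""
    depth = depth.astype(np.float32)
    return np.stack([depth, depth, depth], axis=0)


# Constant dummy images stand in for real sensor frames.
ir = np.full((300, 300), 0.2, dtype=np.float32)
depth = np.full((300, 300), 0.6, dtype=np.float32)
fused = fuse_ir_depth(ir, depth)
depth3 = depth_to_three_channels(depth)
print(fused.shape, depth3.shape)
```

Both helpers return channel-first arrays, matching the (3, 300, 300) image size given above.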
2. train
run training with:
python3 train # To start training a model from scratch
python3 train -r True # To continue an already trained model
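How the training script parses `-r True` is not shown here. If you write your own wrapper script, note that `argparse` with `type=bool` treats any non-empty string (including "False") as true. The sketch below shows one hypothetical way to parse such a flag safely; the long option name `--resume` and the helper `str2bool` are assumptions for illustration only.

```python
import argparse


def str2bool(value):
    """Interpret common textual spellings of a boolean flag value."""
    text = value.strip().lower()
    if text in ("true", "1", "yes"):
        return True
    if text in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("expected True or False, got %r" % value)


# Hypothetical command-line interface, not the repository's actual parser.
parser = argparse.ArgumentParser(description="hypothetical training CLI")
parser.add_argument("-r", "--resume", type=str2bool, default=False,
                    help="continue training from an existing checkpoint")

print(parser.parse_args(["-r", "True"]).resume)
print(parser.parse_args(["-r", "False"]).resume)
print(parser.parse_args([]).resume)
```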
3. visualize from checkpoint
run visualization with:
python3 # To run inference on a validation image
4. run real time inference
run real-time inference with:
python3 # To run real time inference