PapersAnalysis
Complex-YOLO Paper - Intro - Analysis
Overview
Complex-YOLO: Real-time 3D Object Detection on Point Clouds
Arxiv: https://arxiv.org/abs/1803.06199
Key | Value |
---|---|
Type of Contribution | DNN Model |
Application | Object Detection from 3D Point Cloud provided by Lidar |
Domain of Application | Autonomous Car |
Goal
Develop a pipeline connecting
- Input: Point Cloud as BEV RGB Map
  - Obtained through preprocessing
  - "RGB" here just means a 3-channel image, encoding:
    - height from the ground
    - laser ray intensity
    - density (resulting from spatial discretization)
- Output: Detection Results (Bounding Box + Semantic Label)
  - See Fig. 1 of the paper
Related Work
- Used for
- Classification
- Parts Segmentation
- Semantic Segmentation
Work Challenges
Description
- Design a NN able to take sparse data as input, which is difficult in general
- Adapt an existing network that was designed to perform detection on a dense input
Notes
- In practice this is handled by a pre-processing step that converts the Point Cloud, a sparse data structure, into an Image, a dense data structure
- This may add computational cost, and the result may be sensitive to non-trainable pre-processing hyperparameters such as the Grid Cell Size
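
To make the Grid Cell Size sensitivity concrete, here is a small arithmetic sketch. The 80 m x 40 m footprint and the helper name `grid_shape` are assumptions chosen for illustration, not values taken from these notes.

```python
def grid_shape(extent_x_m, extent_y_m, cell_m):
    """Number of cells along each axis for a given footprint and cell size.

    round() absorbs floating-point error in the division.
    """
    return (round(extent_x_m / cell_m), round(extent_y_m / cell_m))


# Hypothetical 80 m x 40 m footprint; only the 8 cm cell size comes from the notes.
for cell_m in (0.04, 0.08, 0.16):
    rows, cols = grid_shape(80.0, 40.0, cell_m)
    print(f"cell={cell_m:.2f} m -> {rows} x {cols} = {rows * cols} cells")
```

Halving the cell size quadruples the number of cells the network has to process, which is one way this non-trainable hyperparameter affects both cost and resolution.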
BEV Construction
- A regular grid is imposed on the space as a geometric prior
- Each cell is a square of about 8 cm per side
- Detected points are projected onto the grid according to the sensor's extrinsic calibration, which defines a Lidar point to cell association
- From this association, per-cell statistics are computed to fill the 3 channels: in the paper these are the maximum height, the maximum intensity, and a density derived from the number of points in the cell
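
The construction above can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the paper's code: the function name `build_bev_map`, the default ranges, and the use of a raw point count for density are hypothetical choices. Height and intensity are reduced with a per-cell maximum, and points are assumed to be already expressed in a frame aligned with the grid.

```python
import numpy as np


def build_bev_map(points, x_range=(0.0, 40.0), y_range=(-20.0, 20.0), cell=0.08):
    """Rasterize Lidar points (N, 4: x, y, z, intensity) into an H x W x 3 map.

    Channels: 0 = max height, 1 = max intensity, 2 = point count per cell.
    Ranges are illustrative defaults; only the 8 cm cell comes from the notes.
    """
    points = np.asarray(points, dtype=np.float64)
    x, y, z, intensity = points[:, 0], points[:, 1], points[:, 2], points[:, 3]

    # Keep only points that fall inside the grid footprint.
    keep = (x >= x_range[0]) & (x < x_range[1]) & (y >= y_range[0]) & (y < y_range[1])
    x, y, z, intensity = x[keep], y[keep], z[keep], intensity[keep]

    # round() absorbs floating-point error in the division.
    height = int(round((x_range[1] - x_range[0]) / cell))
    width = int(round((y_range[1] - y_range[0]) / cell))

    # Spatial quantization: this is the point to cell association.
    rows = np.minimum(((x - x_range[0]) / cell).astype(np.int64), height - 1)
    cols = np.minimum(((y - y_range[0]) / cell).astype(np.int64), width - 1)

    # Start from -inf so that negative heights are not masked by a zero default.
    height_map = np.full((height, width), -np.inf)
    intensity_map = np.full((height, width), -np.inf)
    count = np.zeros((height, width))

    # Unbuffered in-place reductions handle repeated (row, col) indices correctly.
    np.maximum.at(height_map, (rows, cols), z)
    np.maximum.at(intensity_map, (rows, cols), intensity)
    np.add.at(count, (rows, cols), 1.0)

    # Cells that received no point are set to zero.
    empty = count == 0
    height_map[empty] = 0.0
    intensity_map[empty] = 0.0

    return np.stack([height_map, intensity_map, count], axis=-1).astype(np.float32)
```

Calling `build_bev_map(points)` on an `(N, 4)` array returns a dense tensor that an image-based detector can consume. Swapping the maximum for a mean, or normalizing the count, only changes the reduction step.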
Procedure
1. Preprocessing
1.1 Project into Grid
Goals
- Remove the ego perspective so as to work in BEV
- Introduce spatial quantization
1.2 Compute Statistics on Spatial Grids
Goals
- Build an RGB-like representation, more precisely a WxHx3 tensor with channels representing:
  - height from ground
  - intensity
  - cell density
2. Processing
- Use a YOLO-like NN to process the representation produced by the preprocessing step
Processing Strategy
- This method relies on a specifically designed Region Proposal Network
  - Region Proposal Networks are one of the best-known improvements in object detection research, introduced in Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks