Complex-YOLO Paper - Intro - Analysis

Open NicolaBernini opened this issue 6 years ago • 5 comments

Overview

Complex-YOLO: Real-time 3D Object Detection on Point Clouds

Arxiv: https://arxiv.org/abs/1803.06199

Type of Contribution: DNN Model
Application: Object Detection from 3D Point Cloud provided by Lidar
Domain of Application: Autonomous Car

NicolaBernini avatar Feb 22 '19 15:02 NicolaBernini

Goal

Develop a pipeline connecting

  • Input: Point Cloud as BEV RGB Map
    • Obtained as preprocessing
    • Here "RGB" just means a 3-channel image, encoding:
        1. height from the ground
        2. laser ray intensity
        3. density (resulting from spatial discretization)
  • Output: Detection Results (Bounding Box + Semantic Label)

Figure: ComplexYOLO_IO1 (from Fig. 1 of the paper)

Related Work

PointNet

  • Used for
    • Classification
    • Parts Segmentation
    • Semantic Segmentation

NicolaBernini avatar Feb 22 '19 15:02 NicolaBernini

Work Challenges

Description

  1. Design a NN able to take sparse data as input, which is difficult in general
  2. Adapt an existing network that was designed to perform detection on a dense input

Notes

  • In practice, this is handled by a pre-processing step that converts the Point Cloud, which is a sparse data structure, into an image, which is a dense data structure
  • This may add some computational cost, and it makes the result sensitive to non-trainable pre-processing hyperparameters such as the grid cell size
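To make the cell-size sensitivity concrete, here is a rough sketch of how the size of the dense image scales with the grid cell size. The 80 m x 40 m area is a hypothetical value picked for illustration and is not taken from these notes; lengths are kept in integer centimetres to avoid floating-point rounding.

```python
def grid_shape(range_x_cm, range_y_cm, cell_size_cm):
    """Return (rows, cols) of the dense grid needed to cover the given area."""
    # Ceiling division on integers: a partially covered cell still needs a slot.
    rows = -(-range_x_cm // cell_size_cm)
    cols = -(-range_y_cm // cell_size_cm)
    return rows, cols

# Hypothetical area of interest: 80 m x 40 m.
for cell_cm in (4, 8, 16):
    rows, cols = grid_shape(8000, 4000, cell_cm)
    print(f"cell = {cell_cm} cm -> {rows} x {cols} = {rows * cols} cells")
```

Halving the cell size quadruples the number of cells the network has to process, while doubling it merges more points into each cell and loses spatial detail.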

NicolaBernini avatar Feb 22 '19 16:02 NicolaBernini

BEV Construction

  • A regular grid is imposed on the space as a prior
    • Each cell is an 8 cm square
  • Detected points are projected onto the grid according to the sensor extrinsic calibration, which defines a Lidar-point-to-cell association
  • From this association, cell-specific statistics are computed to define the values of the 3 channels: maximum height, maximum intensity, and density as the (normalized) number of points in the cell
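The construction above can be sketched in plain Python as follows. This is a minimal illustration, not the paper's implementation: the function name `build_bev`, the x/y ranges, and the sample points are all hypothetical, and the points are assumed to be already expressed in a ground-aligned frame. The per-cell reduction is exposed as a parameter (`max` or `mean`) rather than fixed, and the density is left as a raw point count.

```python
from collections import defaultdict

def build_bev(points, cell_size=0.08, x_range=(0.0, 40.0),
              y_range=(-20.0, 20.0), reduce="max"):
    """Map (x, y, z, intensity) points to per-cell statistics.

    Returns a dict: (row, col) -> (height, intensity, point_count).
    """
    if reduce not in ("max", "mean"):
        raise ValueError("reduce must be 'max' or 'mean'")

    # Step 1: Lidar point -> cell association.
    cells = defaultdict(list)
    for x, y, z, intensity in points:
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
            continue  # point falls outside the grid
        row = int((x - x_range[0]) / cell_size)
        col = int((y - y_range[0]) / cell_size)
        cells[(row, col)].append((z, intensity))

    # Step 2: per-cell statistics for the 3 channels.
    agg = max if reduce == "max" else (lambda v: sum(v) / len(v))
    bev = {}
    for key, members in cells.items():
        heights = [z for z, _ in members]
        intensities = [i for _, i in members]
        bev[key] = (agg(heights), agg(intensities), len(members))
    return bev

# Hypothetical sample points: (x, y, z, intensity).
points = [
    (10.02, -1.98, 0.2, 0.5),
    (10.03, -1.97, 1.4, 0.1),  # lands in the same cell as the first point
    (25.00, 3.00, 0.6, 0.7),
    (90.00, 0.00, 0.3, 0.2),   # outside x_range, dropped
]
bev = build_bev(points)
for cell, stats in sorted(bev.items()):
    print(cell, stats)
```

A real pipeline would vectorize this with an array library; the dictionary version is only meant to make the association and the statistics explicit.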

NicolaBernini avatar Feb 22 '19 17:02 NicolaBernini

Procedure

1. Preprocessing

1.1 Project into Grid

Goals

  • Remove the ego perspective so as to work in BEV
  • Introduce Spatial Quantization
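As a rough illustration of these two goals, the sketch below first moves a point from the sensor frame into a ground-aligned frame with a 4x4 homogeneous transform, then quantizes its ground-plane coordinates into grid indices. The matrix `T_sensor_to_ground`, the 1.73 m mounting height, the grid origin, and the sample point are hypothetical values chosen for illustration only.

```python
import math

def apply_extrinsic(T, point):
    """Apply a 4x4 homogeneous transform (nested lists, row-major) to (x, y, z)."""
    x, y, z = point
    return tuple(T[r][0] * x + T[r][1] * y + T[r][2] * z + T[r][3]
                 for r in range(3))

def to_cell(point, cell_size, x_min, y_min):
    """Quantize the ground-plane coordinates of a point into (row, col)."""
    x, y, _ = point  # the height is dropped here; it is used later as a channel
    return (math.floor((x - x_min) / cell_size),
            math.floor((y - y_min) / cell_size))

# Hypothetical extrinsic: sensor mounted 1.73 m above the ground, no rotation.
T_sensor_to_ground = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.73],
    [0.0, 0.0, 0.0, 1.0],
]

sensor_point = (10.02, -1.98, -1.5)  # hypothetical Lidar return, sensor frame
ground_point = apply_extrinsic(T_sensor_to_ground, sensor_point)
cell = to_cell(ground_point, cell_size=0.08, x_min=0.0, y_min=-20.0)
print(ground_point, cell)
```

After the transform, the z coordinate is a height above the ground rather than a height relative to the sensor, which is what the height channel needs.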

1.2 Compute Statistics on Spatial Grids

Goals

  • Build an RGB-like representation, more precisely a W x H x 3 tensor with channels representing
    • height from ground
    • intensity
    • cell density
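A minimal sketch of this step, assuming the per-cell statistics are already available as a sparse mapping from grid indices to a (height, intensity, density) triple. The function name `to_dense`, the input format, and the toy values are hypothetical; empty cells are assumed to be zero in all three channels.

```python
def to_dense(cell_stats, width, height):
    """Scatter sparse per-cell statistics into a dense width x height x 3 nested list.

    cell_stats maps (w_idx, h_idx) -> (height_from_ground, intensity, density).
    """
    # Cells that received no Lidar points stay at zero in all channels.
    tensor = [[[0.0, 0.0, 0.0] for _ in range(height)] for _ in range(width)]
    for (w_idx, h_idx), (z, intensity, density) in cell_stats.items():
        if 0 <= w_idx < width and 0 <= h_idx < height:
            tensor[w_idx][h_idx] = [float(z), float(intensity), float(density)]
    return tensor

# Hypothetical toy input: two occupied cells on a 4 x 3 grid.
stats = {(0, 0): (0.5, 0.9, 3), (2, 1): (1.2, 0.4, 1)}
dense = to_dense(stats, width=4, height=3)
print(len(dense), len(dense[0]), len(dense[0][0]))  # prints: 4 3 3
```

The result has the same layout as a small 3-channel image, which is what lets an image detector consume it.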

2. Processing

  • Use a YOLO-like NN to process the representation produced by the preprocessing step

NicolaBernini avatar Feb 25 '19 12:02 NicolaBernini

Processing Strategy

NicolaBernini avatar Feb 25 '19 15:02 NicolaBernini