Learning Ego 3D Representation as Ray Tracing

Website | Paper

Learning Ego 3D Representation as Ray Tracing,
Jiachen Lu, Zheyuan Zhou, Xiatian Zhu, Hang Xu, Li Zhang
ECCV 2022

Demo

Video

News

[2022/07/19]: Configs and instructions for training are released!
[2022/07/05]: First version of Ego3RT is released! Code for the detection head and training configs is coming soon.
[2022/07/04]: Ego3RT is accepted by ECCV 2022!

Abstract

A self-driving perception model aims to extract 3D semantic representations from multiple cameras collectively into the bird's-eye-view (BEV) coordinate frame of the ego car in order to ground the downstream planner. Existing perception methods often rely on error-prone depth estimation of the whole scene or on learning sparse virtual 3D representations without the target geometry structure, both of which remain limited in performance and/or capability. In this paper, we present a novel end-to-end architecture for ego 3D representation learning from an arbitrary number of unconstrained camera views. Inspired by the ray tracing principle, we design a polarized grid of "imaginary eyes" as the learnable ego 3D representation and formulate the learning process with the adaptive attention mechanism in conjunction with the 3D-to-2D projection. Critically, this formulation allows extracting rich 3D representation from 2D images without any depth supervision, and with the built-in geometry structure consistent w.r.t. BEV. Despite its simplicity and versatility, extensive experiments on standard BEV visual tasks (e.g., camera-based 3D object detection and BEV segmentation) show that our model outperforms all state-of-the-art alternatives significantly, with an extra advantage in computational efficiency from multi-task learning.

Methods
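
The abstract describes a polar grid of "imaginary eyes" in the ego frame whose positions are mapped into the camera images by a 3D-to-2D projection. The snippet below is a minimal geometric sketch of that idea only, not the code in this repository. The function names, the camera height, the intrinsics, the 50 m range, and the reading of the 80x256 polar size as 80 radial bins by 256 angular bins are all assumptions made for illustration. The learned adaptive attention that gathers image features around each projected location is not shown.

```python
import math


def polar_grid(num_radius, num_angle, max_range, z=0.0):
    """Build a polar BEV grid of 3D points in the ego frame.

    Assumed ego frame: x forward, y left, z up. Each cell centre is one
    hypothetical "imaginary eye" placed at height z.
    """
    grid = []
    for i in range(num_radius):
        r = max_range * (i + 0.5) / num_radius  # radial bin centre
        row = []
        for j in range(num_angle):
            theta = 2.0 * math.pi * j / num_angle  # angle from the x axis
            row.append((r * math.cos(theta), r * math.sin(theta), z))
        grid.append(row)
    return grid


def ego_to_camera(point, cam_height):
    """Move an ego-frame point into a hypothetical front camera frame.

    Assumed camera frame: x right, y down, z forward, with the camera
    mounted cam_height metres above the ego origin and looking along ego x.
    """
    x_e, y_e, z_e = point
    return (-y_e, cam_height - z_e, x_e)


def project_to_image(point_cam, focal, cx, cy, width, height):
    """Pinhole projection; returns (u, v) in pixels or None if not visible."""
    x, y, z = point_cam
    if z <= 0.0:  # behind the camera
        return None
    u = focal * x / z + cx
    v = focal * y / z + cy
    if 0.0 <= u < width and 0.0 <= v < height:
        return (u, v)
    return None


# Assumed values for illustration only.
grid = polar_grid(num_radius=80, num_angle=256, max_range=50.0)
visible = 0
for row in grid:
    for eye in row:
        pixel = project_to_image(ego_to_camera(eye, cam_height=1.5),
                                 focal=1000.0, cx=800.0, cy=450.0,
                                 width=1600, height=900)
        if pixel is not None:
            visible += 1
print(f"{visible} of {80 * 256} grid points fall inside the front image")
```

With several cameras, the same projection would be repeated per camera, so each grid point can collect evidence from every view in which it is visible. No depth estimate is needed because the 3D positions are fixed by the grid itself.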

Train & Test

Please refer to get_started.md.

Result

3D object detection on nuScenes validation set

Model	Polar size	mAP	NDS	checkpoint
Ego3RT, ResNet101_DCN	80x256	37.5	45.0
Ego3RT, ResNet101_DCN	72x192	37.5	44.9	ego3rt_polar72x192_cart128x128.pth
Ego3RT, VoVNet	80x256	47.8	53.4

3D object detection on nuScenes test set

Model	Polar size	mAP	NDS
Ego3RT, ResNet101_DCN	80x256	38.9	44.3
Ego3RT, VoVNet	80x256	42.5	47.3

BEV segmentation on nuScenes validation set

Model	Polar size	Multitask	mIoU
Ego3RT, EfficientNet	80x256	no	55.5
Ego3RT, ResNet101_DCN	80x256	yes	46.2

License

MIT

Reference

@inproceedings{lu2022ego3rt,
  title={Learning Ego 3D Representation as Ray Tracing},
  author={Lu, Jiachen and Zhou, Zheyuan and Zhu, Xiatian and Xu, Hang and Zhang, Li},
  booktitle={European Conference on Computer Vision},
  year={2022}
}

Acknowledgement

Thanks to previous open-sourced repo:
