PLA icon indicating copy to clipboard operation
PLA copied to clipboard

(CVPR 2023) PLA: Language-Driven Open-Vocabulary 3D Scene Understanding & (CVPR2024) RegionPLC: Regional Point-Language Contrastive Learning for Open-World 3D Scene Understanding

PLA: Language-Driven Open-Vocabulary 3D Scene Understanding

1The University of Hong Kong  2ByteDance
*equal contribution  +corresponding author

CVPR 2023

TL;DR: PLA leverages powerful VL foundation models to construct hierarchical 3D-text pairs for 3D open-world learning.

working space piano vending machine

project page | arXiv

TODO

  • [ ] Release caption processing code

Getting Started

Installation

Please refer to INSTALL.md for the installation.

Dataset Preparation

Please refer to DATASET.md for dataset preparation.

Training & Inference

Please refer to MODEL.md for training and inference scripts and pretrained models.

Citation

If you find this project useful in your research, please consider cite:

@inproceedings{ding2022language,
    title={PLA: Language-Driven Open-Vocabulary 3D Scene Understanding},
    author={Ding, Runyu and Yang, Jihan and Xue, Chuhui and Zhang, Wenqing and Bai, Song and Qi, Xiaojuan},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    year={2023}
}

Acknowledgement

Code is partly borrowed from OpenPCDet, PointGroup and SoftGroup.