[arXiv'25] Unseen 3D Geometry Reasoning from a Single Image.
LaRI: Layered Ray Intersections for Single-view 3D Geometric Reasoning
Rui Li¹ · Biao Zhang¹ · Zhenyu Li¹ · Federico Tombari²,³ · Peter Wonka¹
¹KAUST · ²Google · ³Technical University of Munich
arXiv 2025
LaRI is a single feed-forward method that models unseen 3D geometry using layered point maps, enabling complete, efficient, and view-aligned geometric reasoning from a single image.
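For intuition, here is a minimal NumPy sketch (illustrative only, not the repository's data structures) of what a layered point map looks like: up to L ray-surface intersection points per pixel, where layer 0 is the visible surface and rays that hit fewer than L surfaces are marked invalid.

```python
import numpy as np

# Illustrative layout only: a layered point map for an H x W image with up to
# L ray-surface intersections per pixel (layer 0 = the visible surface).
H, W, L = 4, 4, 3
points = np.full((L, H, W, 3), np.nan, dtype=np.float32)  # xyz per layer/pixel

# Example: the ray through pixel (row=1, col=2) enters and exits an object.
points[0, 1, 2] = [0.1, -0.2, 1.5]   # first intersection (front surface)
points[1, 1, 2] = [0.1, -0.2, 1.9]   # second intersection (back surface)

valid = ~np.isnan(points[..., 0])    # (L, H, W) mask: did the ray hit layer l?
layered_depth = points[..., 2]       # per-layer depth is just the z component
print(valid.sum(), layered_depth[0, 1, 2], layered_depth[1, 1, 2])  # 2 1.5 1.9
```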
📝 TODO List
- [x] Inference code & Gradio demo
- [x] Evaluation data & code
- [x] Training data & code
- [ ] Release the GT generation code (estimated: July 2025)
🛠️ Environment Setup
- Create the conda environment and install required libraries:

```bash
conda create -n lari python=3.10 -y
conda activate lari
pip install -r requirements.txt
```
- Install PyTorch3D following these instructions.
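Optionally, you can sanity-check the environment with a short Python snippet (this assumes the `lari` environment above is active):

```python
# Optional sanity check: confirm the core dependencies import and see whether
# a CUDA device is visible.
import torch
import pytorch3d

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("pytorch3d:", pytorch3d.__version__)
```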
🚀 Quick Start
We currently provide the object-level model at our HuggingFace Model Hub. Try the examples or use your own images with the methods below:
Gradio Demo
Launch the Gradio interface locally:

```bash
python app.py
```
Or try it online via HuggingFace Demo.
Command Line
Run object-level modeling with:

```bash
python demo.py --image_path assets/cole_hardware.png
```
The input image path is specified via `--image_path`. Set `--is_remove_background` to remove the background. Layered depth maps and the 3D model will be saved in the `./results` directory by default.
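If you want to inspect the exported 3D model programmatically, something like the sketch below works. It assumes `trimesh` is installed (`pip install trimesh`), and the output filename is only a guess based on the input image name; adjust it to whatever `demo.py` actually writes into `./results`.

```python
# Illustrative only: load the exported mesh with trimesh and print a summary.
# The filename below is a placeholder; adjust it to the actual output file.
import trimesh

mesh = trimesh.load("./results/cole_hardware.glb")  # hypothetical output path
print(mesh)                       # vertex/face counts or scene summary
print("bounding box:", mesh.bounds)
```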
📊 Evaluation
Pre-trained weights and Evaluation Data
| Scene Type | Pre-trained Weights | Evaluation Data |
|---|---|---|
| Object-level | checkpoint | Google Scanned Objects (data) |
| Scene-level | checkpoint | SCRREAM (data) |
Download the pre-trained weights and unzip the evaluation data.
Object-level Evaluation
```bash
./scripts/eval_object.sh
```
Scene-level Evaluation
```bash
./scripts/eval_scene.sh
```
NOTE: For both object- and scene-level evaluation, set `data_path` and `test_list_path` to your absolute paths, set `--pretrained` to your model checkpoint path, and set `--output_dir` to specify where the evaluation results are stored.
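The metrics themselves are computed by the evaluation scripts. Purely as an illustration of the kind of point-cloud comparison involved (not the repository's metric code), PyTorch3D's `chamfer_distance` compares a predicted point set against a ground-truth one:

```python
# Illustration only (NOT the repository's evaluation code): Chamfer distance
# between two point clouds with PyTorch3D, the basic predicted-vs-ground-truth
# comparison underlying point-map evaluation.
import torch
from pytorch3d.loss import chamfer_distance

pred = torch.rand(1, 2048, 3)  # (batch, num_points, xyz) predicted points
gt = torch.rand(1, 2048, 3)    # ground-truth points

loss, _ = chamfer_distance(pred, gt)
print("chamfer distance:", loss.item())
```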
💻 Training
💾 Dataset setup
1. Objaverse (object-level)
Download the processed Objaverse dataset and extract all files (`objaverse_chunk_<ID>.tar.gz`) into the target folder, for example:
```bash
mkdir ./datasets/objaverse_16k
tar -zxvf ./objaverse_chunk_<ID>.tar.gz -C ./datasets/objaverse_16k
```
2. 3D-FRONT (scene-level)
Download the processed 3D-FRONT dataset and extract all files (`front3d_chunk_<ID>.tar.gz`) to the target folder, for example:
```bash
mkdir ./datasets/3dfront
tar -zxvf ./front3d_chunk_<ID>.tar.gz -C ./datasets/3dfront
```
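If you have many chunk archives to unpack, a small helper like the one below (not part of the repository) extracts them all in one pass; the same pattern works for the Objaverse chunks above.

```python
# Hypothetical helper (not part of the repository): extract every chunked
# archive matching a pattern into a target folder. Adjust the pattern and
# output directory for Objaverse or 3D-FRONT.
import glob
import os
import tarfile

pattern = "./front3d_chunk_*.tar.gz"   # or "./objaverse_chunk_*.tar.gz"
out_dir = "./datasets/3dfront"         # or "./datasets/objaverse_16k"
os.makedirs(out_dir, exist_ok=True)

for archive in sorted(glob.glob(pattern)):
    print("extracting", archive)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(out_dir)
```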
3. ScanNet++ (scene-level)
- Download the ScanNet++ dataset, as well as the ScanNet++ toolbox.
- Copy the `.yml` configuration files to the ScanNet++ toolbox folder, for example:
```bash
cd /path/to/lari
cp -r ./scripts/scannetpp_proc/*.yml /path/to/scannetpp/scannetpp/dslr/configs
```
- Run the following commands in the ScanNet++ toolbox folder to downscale and undistort the data:
```bash
cd /path/to/scannetpp
# downscale the images
python -m dslr.downscale dslr/configs/downscale_lari.yml
# undistort the images
python -m dslr.undistort dslr/configs/undistort_lari.yml
```
- Download the ScanNet++ annotation from here and extract it to the `data` subfolder of your ScanNet++ path, for example:
```bash
tar -zxvf ./scannetpp_48k_annotation.tar.gz -C ./datasets/scannetpp_v2/data
```
🔥 Train the model
Download MoGe pre-trained weights. For training with object-level data (Objaverse), run
```bash
./scripts/train_object.sh
```
For training with scene-level data (3D-FRONT and ScanNet++), run
```bash
./scripts/train_scene.sh
```
For both training settings, set `data_path`, `train_list_path`, and `test_list_path` of each dataset to your absolute paths, set `pretrained_path` to the downloaded MoGe weights path, and set `--output_dir` and `--wandb_dir` to specify where checkpoints and training logs are stored.
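Before launching a long run, it can help to verify that the list files resolve to existing data. The sketch below is hypothetical and assumes one sample path per line relative to `data_path`; adapt it to the actual list format used by the training scripts.

```python
# Hypothetical pre-flight check (not part of the repository): verify that the
# entries of a train/test list resolve to existing files. Assumes one path per
# line relative to data_path; adapt to the actual list format.
import os

data_path = "./datasets/objaverse_16k"                   # placeholder
list_path = "./datasets/objaverse_16k/train_list.txt"    # placeholder

with open(list_path) as f:
    entries = [line.strip() for line in f if line.strip()]

missing = [e for e in entries if not os.path.exists(os.path.join(data_path, e))]
print(f"{len(entries)} entries in list, {len(missing)} missing on disk")
```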
✨ Acknowledgement
This project is largely based on DUSt3R, with some model weights and functions from MoGe, Zero-1-to-3, and Marigold. Many thanks to these awesome projects for their contributions.
📰 Citation
Please cite our paper if you find it helpful:
```bibtex
@article{li2025lari,
  title={LaRI: Layered Ray Intersections for Single-view 3D Geometric Reasoning},
  author={Li, Rui and Zhang, Biao and Li, Zhenyu and Tombari, Federico and Wonka, Peter},
  journal={arXiv preprint arXiv:2504.18424},
  year={2025}
}
```