StreamRF icon indicating copy to clipboard operation
StreamRF copied to clipboard

Official implementation of our NeurIPS paper "Streaming Radiance Fields for 3D Video Synthesis"

Streaming Radiance Fields for 3D Video Synthesis

Lingzhi Li, Zhen Shen, Zhongshu Wang, Li Shen, Ping Tan

Alibaba Group

Citation:


@article{li2022streaming,
  title={Streaming radiance fields for 3d video synthesis},
  author={Li, Lingzhi and Shen, Zhen and Wang, Zhongshu and Shen, Li and Tan, Ping},
  journal={Advances in Neural Information Processing Systems},
  volume={35},
  pages={13485--13498},
  year={2022}
}

arXiv: https://arxiv.org/abs/2210.14831

https://user-images.githubusercontent.com/28325733/210695784-a309dce8-533b-4c93-b637-da369e2a288e.mp4

Due to size limit, this is a downsampled video, check full resolution video here.

Dataset

Meet Room Dataset: ModelScope魔搭Google Drive

We will add more data in ModelScope. (我们会在魔搭里更新更多的数据,敬请关注)

N3DV Dataset: https://github.com/facebookresearch/Neural_3D_Video

Training StreamRF

Following the setup of the orginal plenoxels' repository

For each scene, extract frames from every video, and arrange them into the following structure:

python prepare_dataset.py <video_dir>
<data_dir>
  ├── 0000  
  |   ├── poses_bounds.npy  
  |   └── images
  |       └── cam[00/01/02/.../20].png
  ...
  └── 0299 
      ├── poses_bounds.npy  
      └── images
          └── cam[00/01/02/.../20].png

We provide the pose_bounds.npy of both dataset in the Meet Room Dataset's link. If you wants to generate poses_bounds.npy for yourself check DS-NeRF's repo.

Meet Room Dataset

  1. Initialize the first frame model
python opt.py  -t <log_dir> <data_dir>/0000 -c configs/meetroom_init.json --scale 1.0
  1. Train the pilot model
python train_video_n3dv_pilot.py -t <log_dir> <data_dir> -c configs/meetroom.json --batch_size 20000   --pretrained <pretrained_ckpt>  --n_iters 1000    --lr_sigma 0.3  --lr_sigma_final 0.3  --lr_sh 1e-2 --lr_sh_final 1e-4 --lr_sigma_decay_steps 1000 --lr_sh_decay_steps 1000   --frame_end 300 --fps 30 --train_use_all 0 --scale 1.0 --sh_keep_thres 1.0 --sh_prune_thres 0.1 --performance_mode  --dilate_rate_before 1 --dilate_rate_after 1 --stop_thres 0.01  --compress_saving --save_delta   --pilot_factor 2 
  1. Train the full model
python train_video_n3dv_full.py -t <log_dir> <data_dir> -c configs/meetroom_full.json --batch_size 20000   --pretrained <pretrained_ckpt>  --n_iters 500    --lr_sigma 1.0 --lr_sigma_final 1.0  --lr_sh 1e-2 --lr_sh_final 1e-2 --lr_sigma_decay_steps 500 --lr_sh_decay_steps 500   --frame_end 300 --fps 30 --train_use_all 0 --scale 1.0 --sh_keep_thres 1.5 --sh_prune_thres 0.3 --performance_mode  --dilate_rate_before 2 --dilate_rate_after 2    --compress_saving --save_delta  --apply_narrow_band 

N3DV Dataset

  1. Initialize the first frame model
python opt.py  -t <log_dir> <data_dir>/0000 -c configs/init_ablation/n3dv_init.json --offset 500 --scale 0.5 --nosphereinit 
  1. Train the pilot model
python train_video_n3dv_pilot.py -t <log_dir> <data_dir> -c configs/n3dv.json --batch_size 20000    --pretrained <pretrained_ckpt>  --n_iters 750    --lr_sigma 1.0  --lr_sigma_final 1.0  --lr_sh 1e-2 --lr_sh_final 1e-3 --lr_sigma_decay_steps 750 --lr_sh_decay_steps 750   --frame_end 300 --fps 30 --train_use_all 0 --offset 750  --scale 0.5 --sh_keep_thres 0.5 --sh_prune_thres 0.1 --performance_mode  --dilate_rate_before 1 --dilate_rate_after 1 --stop_thres 0.01  --compress_saving --save_delta   --pilot_factor 2 
  1. Train the full model
python train_video_n3dv_full.py -t <log_dir> <data_dir> -c configs/n3dv_full.json --batch_size 20000   --pretrained  <pretrained_ckpt>  --n_iters 500    --lr_sigma 1.0 --lr_sigma_final 1.0  --lr_sh 1e-2 --lr_sh_final 3e-3 --lr_sigma_decay_steps 500 --lr_sh_decay_steps 300   --frame_end 300 --fps 30 --train_use_all 0 --offset 1500  --scale 0.5 --sh_keep_thres 1.0 --sh_prune_thres 0.2 --performance_mode  --dilate_rate_before 2 --dilate_rate_after 2  --stop_thres 0.01  --compress_saving --save_delta  --apply_narrow_band

We can change the tv loss weight for better video quality but lower PSNR

  1. Train the pilot model
python train_video_n3dv_full.py -t <log_dir> <data_dir> -c configs/n3dv_full_hightv.json --batch_size 20000   --pretrained  <pretrained_ckpt>  --n_iters 500    --lr_sigma 1.0 --lr_sigma_final 1.0  --lr_sh 1e-2 --lr_sh_final 3e-3 --lr_sigma_decay_steps 500 --lr_sh_decay_steps 300   --frame_end 300 --fps 30 --train_use_all 0 --offset 1500  --scale 0.5 --sh_keep_thres 1.0 --sh_prune_thres 0.2 --performance_mode  --dilate_rate_before 2 --dilate_rate_after 2  --stop_thres 0.01  --compress_saving --save_delta  --apply_narrow_band
```bash

## Testing StreamRF

For Meet Room Dataset:
```bash
python render_delta.py  -t <log_dir> <data_dir> -c configs/meetroom_full.json --batch_size 20000    --pretrained <pretrained_ckpt>  --frame_end 300 --fps 30 --scale 1.0 --performance_mode  

For N3DV Dataset:

python render_delta.py  -t <log_dir> <data_dir> -c configs/n3dv_full.json --batch_size 20000    --pretrained <pretrained_ckpt>  --frame_end 300 --fps 30 --scale 0.5 --performance_mode