
We evaluate our method on different datasets (including ShapeNet, CUB-200-2011, and Pascal3D+) and achieve state-of-the-art results, outperforming all other supervised and unsupervised methods.

An Effective Loss Function for Generating 3D Models from Single 2D Image without Rendering

Papers with code | Paper

Nikola Zubić   Pietro Lio  

University of Novi Sad   University of Cambridge

AIAI 2021

Citation

Besides AIAI 2021, our paper is also included in the Springer book "Artificial Intelligence Applications and Innovations": link

Please cite our paper if you find this code useful for your research.

@article{zubic2021effective,
  title={An Effective Loss Function for Generating 3D Models from Single 2D Image without Rendering},
  author={Zubi{\'c}, Nikola and Li{\`o}, Pietro},
  journal={arXiv preprint arXiv:2103.03390},
  year={2021}
}

Prerequisites

  • Download code:
    Clone the code with the following command:

    git clone https://github.com/NikolaZubic/2dimageto3dmodel.git
    
  • Open the project in a Conda environment (Python 3.7)

  • Install packages:

    conda install pytorch torchvision torchaudio cudatoolkit=11.0 -c pytorch
    

    Then clone the Kaolin library into the root (2dimageto3dmodel) folder, check out the commit below, and run the following commands:

    git clone https://github.com/NVIDIAGameWorks/kaolin.git
    cd kaolin
    git checkout e7e513173b
    python setup.py install
    pip install --no-dependencies nuscenes-devkit opencv-python-headless scikit-learn joblib pyquaternion cachetools
    pip install packaging
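
    After installation, a quick sanity check can confirm that PyTorch sees the GPU and that Kaolin imports correctly (a minimal check using only the packages installed above):

    python -c "import torch, kaolin; print('torch', torch.__version__, '| CUDA available:', torch.cuda.is_available())"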
    

Run the program

Run the following commands from the root/code/ (2dimageto3dmodel/code/) directory:

python main.py --dataset cub --batch_size 16 --weights pretrained_weights_cub --save_results

for the CUB Birds Dataset.

python main.py --dataset p3d --batch_size 16 --weights pretrained_weights_p3d --save_results

for the Pascal 3D+ Dataset.

The results will be saved in the 2dimageto3dmodel/code/results/ directory.

Continue training

To continue the training process, run the following commands (without --save_results) from the root/code/ (2dimageto3dmodel/code/) directory:

python main.py --dataset cub --batch_size 16 --weights pretrained_weights_cub

for the CUB Birds Dataset.

python main.py --dataset p3d --batch_size 16 --weights pretrained_weights_p3d

for the Pascal 3D+ Dataset.

Generation of Pseudo-ground-truths

These reconstruction steps require a trained mesh estimation model. We can use the pre-trained model (already provided) or train one from scratch. The pseudo-ground-truth data for CUB birds is generated as follows:

python run_reconstruction.py --name pretrained_reconstruction_cub --dataset cub --batch_size 10 --generate_pseudogt

For the Pascal 3D+ dataset:

python run_reconstruction.py --name pretrained_reconstruction_p3d --dataset p3d --optimize_z0 --batch_size 10 --generate_pseudogt

This replaces the cache directory, which contains pre-computed statistics for evaluating Fréchet Inception Distance (FID), pose and image metadata, and the pseudo-ground-truths for each image.
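
For reference, the Fréchet Inception Distance compares two sets of Inception activations through their means and covariances. A minimal NumPy/SciPy sketch of the distance itself is shown below; the variable names and the cache layout are illustrative and not the repository's actual format:

import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    # FID = ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * sqrt(sigma1 @ sigma2))
    diff = mu1 - mu2
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # drop tiny imaginary parts caused by numerical error
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))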

Mesh generator training from scratch

Set up the Pseudo-ground-truth data as described in the section above, then execute the following command:

python main.py --name cub_512x512_class --conditional_class --dataset cub --gpu_ids 0,1,2,3 --batch_size 32 --epochs 1000 --tensorboard

Here, we train a CUB birds model, conditioned on class labels, for 1000 epochs. FID is evaluated every 20 epochs (the interval can be changed with --evaluate_freq). Using a different number of GPUs can produce slightly different results. The --tensorboard flag exports training logs to the TensorBoard log directory tensorboard_gan.
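
To monitor these runs, TensorBoard can be pointed at that directory (standard TensorBoard usage rather than a script from this repository):

tensorboard --logdir tensorboard_gan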

After training, we can evaluate the best checkpoint with the following command:

python main.py --name cub_512x512_class --conditional_class --dataset cub --gpu_ids 0,1,2,3 --batch_size 64 --evaluate --which_epoch best

Mesh estimation model training

Use the following two commands for training from scratch:

python run_reconstruction.py --name pretrained_reconstruction_cub --dataset cub --batch_size 50 --tensorboard
python run_reconstruction.py --name pretrained_reconstruction_p3d --dataset p3d --optimize_z0 --batch_size 50 --tensorboard

TensorBoard log files are saved in tensorboard_recon.

License

MIT

Acknowledgment

This work builds on the architecture of Insafutdinov & Dosovitskiy.
Poisson Surface Reconstruction was used for Point Cloud to 3D Mesh transformation.
The GAN architecture (used for texture mapping) is a mixture of Xian's TextureGAN and Li's GAN.
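
As an illustration of the point-cloud-to-mesh step, Poisson Surface Reconstruction can be run with Open3D (a separate library, not a dependency of this repository). A minimal sketch, assuming a point cloud stored in a hypothetical points.ply:

import open3d as o3d

# Load a point cloud and estimate normals (Poisson reconstruction needs oriented normals).
pcd = o3d.io.read_point_cloud("points.ply")  # hypothetical input file
pcd.estimate_normals()

# Reconstruct a triangle mesh; depth controls the octree resolution (higher = finer detail).
mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(pcd, depth=9)
o3d.io.write_triangle_mesh("mesh.obj", mesh)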