TensorFlow implementation of "Efficient Deep Learning for Stereo Matching"

stereo_matching

This is a TensorFlow re-implementation of "Efficient Deep Learning for Stereo Matching" by Luo, W., Schwing, A. G., & Urtasun, R. (CVPR 2016). (https://www.cs.toronto.edu/~urtasun/publications/luo_etal_cvpr16.pdf)

To run

Setup data folders

data
└───kitti_2015
    ├───training
    │   ├───image_2
    │   │   ├───000000_10.png
    │   │   ├───000001_10.png
    │   │   └───...
    │   ├───image_3
    │   ├───disp_noc_0
    │   └───...
    └───testing
        ├───image_2
        └───image_3
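
Before starting a long training run, it can help to confirm that the folders are where the layout above says they should be. The following is a minimal sketch and not part of the repository: the helper names `missing_folders` and `list_stereo_pairs` are hypothetical, and it assumes the usual KITTI convention that `image_2` holds the left frames and `image_3` the right frames.

```python
from pathlib import Path

# Sub-folders named in the layout above. Only the folders that the
# layout lists explicitly are checked; the "..." entries are ignored.
EXPECTED = {
    "training": ["image_2", "image_3", "disp_noc_0"],
    "testing": ["image_2", "image_3"],
}


def missing_folders(root):
    """Return the expected sub-folders that do not exist under `root`."""
    root = Path(root)
    return [
        str(Path(split) / name)
        for split, names in EXPECTED.items()
        for name in names
        if not (root / split / name).is_dir()
    ]


def list_stereo_pairs(root, split="training"):
    """Pair left (image_2) and right (image_3) frames by file name."""
    left_dir = Path(root) / split / "image_2"
    right_dir = Path(root) / split / "image_3"
    pairs = []
    for left in sorted(left_dir.glob("*_10.png")):
        right = right_dir / left.name
        if right.exists():  # skip left frames without a right counterpart
            pairs.append((left, right))
    return pairs
```

Calling `missing_folders("data/kitti_2015")` returns an empty list when the layout is complete.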

Start training

python main.py --dataset kitti_2015 --patch-size 37 --disparity-range 201

Results

  • All results are reported after training for 40k iterations.
  • Qualitative results are shown on the validation set.
  • The 3-pixel error is evaluated on the validation set.

KITTI 2015 Stereo

Example input images

Disparity Ground-truth

Example input patches
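
To make the relationship between `--patch-size 37` and `--disparity-range 201` concrete, here is a hedged NumPy sketch of how a training patch pair could be cut out. It is an illustration, not the repository's code: the function name `extract_patch_pair` is hypothetical, and it assumes the right patch is centred on the ground-truth match and widened by half the disparity range on each side, so that the correct class index is the middle of the range.

```python
import numpy as np


def extract_patch_pair(left, right, row, col, disparity,
                       patch_size=37, disparity_range=201):
    """Cut a square left patch and a wider right patch around a match.

    Assumption: a left pixel at column `col` with ground-truth disparity
    `disparity` matches the right pixel at column `col - disparity`.
    """
    half_patch = patch_size // 2
    half_range = disparity_range // 2
    right_col = col - disparity

    top, bottom = row - half_patch, row + half_patch + 1
    right_start = right_col - half_patch - half_range
    right_stop = right_col + half_patch + half_range + 1
    if (top < 0 or bottom > left.shape[0]
            or col - half_patch < 0 or col + half_patch + 1 > left.shape[1]
            or right_start < 0 or right_stop > right.shape[1]):
        raise ValueError("patch would extend past the image border")

    left_patch = left[top:bottom, col - half_patch:col + half_patch + 1]
    right_patch = right[top:bottom, right_start:right_stop]
    # The true match sits in the middle of the candidate positions.
    label = half_range
    return left_patch, right_patch, label
```

With the defaults, the left patch is 37 x 37 and the right patch is 37 x 237, which gives 201 candidate horizontal positions for the left patch inside the right one.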

Qualitative results

Post-processing
  • Cost-aggregation
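
As an illustration of what cost aggregation does, the sketch below averages each disparity slice of a cost volume over a small spatial window. This is a simplified NumPy version under stated assumptions, not the repository's implementation: the function name `aggregate_costs`, the default window of 5, and the edge padding are all choices made for this example.

```python
import numpy as np


def aggregate_costs(cost_volume, window=5):
    """Average a (H, W, D) cost volume over a window x window neighbourhood.

    `window` is assumed to be odd. Borders are handled by repeating the
    edge values, which is an assumption of this sketch.
    """
    pad = window // 2
    padded = np.pad(cost_volume, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    height, width, _ = cost_volume.shape
    out = np.zeros(cost_volume.shape, dtype=np.float64)
    # Sum the shifted copies of the volume, one per offset in the window.
    for dy in range(window):
        for dx in range(window):
            out += padded[dy:dy + height, dx:dx + width, :]
    return out / (window * window)
```

Because every pixel's scores are blended with those of its neighbours, isolated outliers in the per-pixel prediction tend to be suppressed, at the price of some blurring at object boundaries.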

Without cost-aggregation

With cost-aggregation

A closer look at how cost aggregation smooths the predictions (without and with cost aggregation, respectively):

Quantitative results

  • To compare with the results reported in the paper, see Table 5, column Ours(37).

    Method                              3-pixel error (%)
    baseline (paper)                    7.13
    baseline (re-implementation)        7.271
    baseline + CA (paper)               6.58
    baseline + CA (re-implementation)   6.527
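
For readers unfamiliar with the metric, the 3-pixel error is the percentage of pixels with valid ground truth whose predicted disparity is off by more than 3 pixels. The sketch below is one way to compute it and is not taken from the repository: the function name `three_pixel_error` is hypothetical, and it assumes that a ground-truth value of 0 marks a pixel without ground truth, as in the KITTI disparity maps. The official KITTI 2015 benchmark applies an additional relative threshold, which this plain version omits.

```python
import numpy as np


def three_pixel_error(pred, gt, threshold=3.0):
    """Percentage of valid pixels with |pred - gt| > threshold."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    valid = gt > 0  # assumption: 0 means "no ground truth" at this pixel
    if not valid.any():
        raise ValueError("ground truth contains no valid pixels")
    bad = np.abs(pred[valid] - gt[valid]) > threshold
    return 100.0 * bad.mean()
```

Pixels without ground truth are excluded from both the numerator and the denominator, so sparse disparity maps do not dilute the score.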

KITTI 2012 Stereo

Qualitative results

Possible next steps

  • [ ] Implement post-processing to smooth the output.
  • [ ] Look into error metrics and do quantitative analysis.
  • [ ] Run inference on test video sequences.
  • [x] Instead of the batch matrix multiplication during inference, which constructs a B x H x W x W tensor, use a loop to compute the cost volume over the disparity range. The TensorFlow runtime may be able to parallelise the operations in the loop.
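
The last item in the list can be illustrated with a small NumPy sketch for a single image (no batch dimension). This is not the repository's TensorFlow code: the function names are hypothetical, and it assumes that a left pixel at column x with disparity d is compared with the right pixel at column x - d. The loop version only ever holds an H x W x D result, while the matmul version first builds the full H x W x W tensor and then reads out the same entries.

```python
import numpy as np


def cost_volume_loop(left_feat, right_feat, disparity_range):
    """Score each left pixel against right pixels shifted by d = 0..D-1.

    left_feat, right_feat: (H, W, C) feature maps. Returns (H, W, D);
    entries whose match would fall outside the image stay at -inf.
    """
    height, width, _ = left_feat.shape
    volume = np.full((height, width, disparity_range), -np.inf)
    for d in range(min(disparity_range, width)):
        # Inner product of left column x with right column x - d.
        volume[:, d:, d] = np.sum(
            left_feat[:, d:] * right_feat[:, :width - d], axis=-1)
    return volume


def cost_volume_matmul(left_feat, right_feat, disparity_range):
    """Same scores via one batched matmul that builds an (H, W, W) tensor."""
    height, width, _ = left_feat.shape
    full = left_feat @ right_feat.transpose(0, 2, 1)  # (H, W, W)
    volume = np.full((height, width, disparity_range), -np.inf)
    for d in range(min(disparity_range, width)):
        cols = np.arange(d, width)
        volume[:, cols, d] = full[:, cols, cols - d]
    return volume
```

Both functions return the same values; the difference is the size of the intermediate tensor, which grows with W squared in the matmul version but only with W times D in the loop version.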