DocIIW

Repository for the paper "Intrinsic Decomposition of Document Images In-the-Wild" (BMVC '20)

Quick Links: PDF | arXiv | Talk | Supplementary

Updates

  • Sep 5th, 2020: Initial data is released (90K images).
  • Mar 20th, 2021: Evaluation images are released.
  • Nov 8th, 2022: Training code and models are released.
  • Coming Soon: Training details.

Doc3DShade

Doc3DShade extends Doc3D with realistic lighting and shading. It follows a similar synthetic rendering procedure using captured 3D shapes of documents, but the final image generation step combines real shading captured from different types of paper materials under numerous illumination conditions.
The following figure illustrates the image generation pipeline (figure: Dataset Capture Pipeline).

The following figure shows a side-by-side comparison of images in Doc3DShade and Doc3D (figure: Comparison with Doc3D).
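As a rough illustration of how a texture and a shading map can be combined into a final image, the sketch below uses the standard intrinsic image model (image = albedo × shading, per pixel). This is an assumption made for illustration, not the repository's rendering code; the function names `compose_image` and `recover_albedo` are hypothetical, and inputs are assumed to be float arrays scaled to [0, 1].

```python
import numpy as np


def compose_image(albedo, shading):
    """Multiply an albedo (texture) map by a shading map, pixel by pixel.

    Assumes the multiplicative intrinsic image model; the actual Doc3DShade
    pipeline may differ in its details.
    """
    albedo = np.asarray(albedo, dtype=np.float32)
    shading = np.asarray(shading, dtype=np.float32)
    if albedo.shape != shading.shape:
        raise ValueError("albedo and shading must have the same shape")
    return np.clip(albedo * shading, 0.0, 1.0)


def recover_albedo(image, shading, eps=1e-6):
    """Invert the same model: divide the image by its shading map."""
    image = np.asarray(image, dtype=np.float32)
    shading = np.asarray(shading, dtype=np.float32)
    # Guard against division by zero in fully dark shading regions.
    return np.clip(image / np.maximum(shading, eps), 0.0, 1.0)
```

Under this model, removing shading from a document image amounts to estimating the shading map and dividing it out, which is what `recover_albedo` does when the shading is known.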

Data Download Instructions

Doc3DShade contains 90K images: 80K are used for training and 10K for validation. Split used in the paper: train, val

  • Download the input images from img.zip.
  • Download the white-balanced images from wbl.zip.
  • Download synthetic textures from alb.zip.
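
Once the three archives are downloaded, they can be unpacked and matched up along the lines of the sketch below. It assumes the archives sit in one local directory and that corresponding files share the same relative path (ignoring the extension) inside each archive. That layout is a guess, not something this README confirms, and the helper names are hypothetical.

```python
import zipfile
from pathlib import Path

# Archive names taken from the download list above.
ARCHIVES = ("img", "wbl", "alb")


def extract_archives(download_dir, out_dir):
    """Extract <name>.zip into out_dir/<name> for each archive."""
    download_dir, out_dir = Path(download_dir), Path(out_dir)
    roots = {}
    for name in ARCHIVES:
        target = out_dir / name
        target.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(download_dir / f"{name}.zip") as zf:
            zf.extractall(target)
        roots[name] = target
    return roots


def index_by_stem(root):
    """Map each file's relative path (without extension) to its full path."""
    root = Path(root)
    return {
        p.relative_to(root).with_suffix("").as_posix(): p
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def pair_samples(roots):
    """Return one dict per sample that is present in every archive.

    Assumption: matching files have identical relative paths across archives.
    """
    indexes = {name: index_by_stem(root) for name, root in roots.items()}
    common = set.intersection(*(set(ix) for ix in indexes.values()))
    return [
        {name: indexes[name][key] for name in indexes}
        for key in sorted(common)
    ]
```

A call such as `pair_samples(extract_archives("downloads", "data"))` would then yield a list of dictionaries with `img`, `wbl`, and `alb` paths per sample. If the archives use a different internal layout, `index_by_stem` is the place to adapt the matching key.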

Training Instructions

  • Upcoming

Pre-trained Models

Evaluation Images and Results

  • Real test images are given in: /testimgs/real
  • Shading removed real test images:
  • Shading removed DocUNet [1] images are available at:
  • Shading removed and unwarped [2] DocUNet [1] images are available at:

Citation:

If you use the dataset, please consider citing our work:

@inproceedings{DasDocIIW20,
  author    = {Sagnik Das and Hassan Ahmed Sial and Ke Ma and Ramon Baldrich and Maria Vanrell and Dimitris Samaras},
  title     = {Intrinsic Decomposition of Document Images In-the-Wild},
  booktitle = {31st British Machine Vision Conference 2020, {BMVC} 2020, Manchester, UK, September 7-10, 2020},
  publisher = {{BMVA} Press},
  year      = {2020},
}

References:

[1] DocUNet: https://www3.cs.stonybrook.edu/~cvl/docunet.html

[2] DewarpNet: https://sagniklp.github.io/dewarpnet-webpage/