Recommendations-Document-Image-Processing
This repository contains a paper collection of the methods for document image processing, including appearance enhancement, deshadow, dewarping, deblur, and binarization.
📖 Recommendations of Document Image Processing
🔥 Contents
- 1. Appearance Enhancement
  - 1.1 Papers
  - 1.2 Datasets
  - 1.3 Apps
  - 1.4 SOTA
- 2. Deshadow
  - 2.1 Papers
  - 2.2 Datasets
  - 2.3 SOTA
- 3. Dewarping
  - 3.1 Papers
  - 3.2 Dataset
  - 3.3 SOTA
- 4. Deblur
  - 4.1 Papers
  - 4.2 Datasets
  - 4.3 SOTA
- 5. Binarization
  - 5.1 Papers
  - 5.2 Datasets
  - 5.3 SOTA
- ⭐ Star Rising
1. Appearance Enhancement
Appearance enhancement (also known as illumination correction) is not limited to a specific degradation type; it aims to restore a clean appearance similar to that of a scanned document or a born-digital PDF file.
1.1 Papers
1.2 Datasets
Dataset | Num. (train/test) | Type | Example | Download |
---|---|---|---|---|
Doc3DShade | 90K | Synth | Example | Link |
DocProj | 2450 | Synth | Example | Link |
DocUNet from DocAligner | 130 | Real | Example | Link |
RealDAE | 600 (450/150) | Real | Example | Link |
Inv3D | 25K | Synth | Example | Link |
1.3 Apps
1.4 SOTA
Venue | Method | Training data | DocUNet from DocAligner (130) SSIM | DocUNet from DocAligner (130) PSNR | RealDAE (150) SSIM | RealDAE (150) PSNR |
---|---|---|---|---|---|---|
- | - | - | 0.7195 | 13.09 | 0.8264 | 12.26 |
TOG'19 | DocProj | DocProj | 0.7098 | 14.71 | 0.8684 | 19.35 |
BMVC'20 | Das et al. | Doc3DShade | 0.7276 | 16.42 | 0.8633 | 19.87 |
MM'21 | DocTr | DocProj | 0.7067 | 15.78 | 0.7925 | 18.62 |
MM'22 | UDoc-GAN | DocProj | 0.6833 | 14.29 | 0.7558 | 16.43 |
TAI'23 | GCDRNet | RealDAE | 0.7658 | 17.09 | 0.9423 | 24.42 |
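For readers unfamiliar with the PSNR columns above, the metric can be sketched as follows. This is a minimal illustration that assumes 8-bit images of identical shape loaded as NumPy arrays; the function names `psnr` and `mean_psnr` are hypothetical, and the exact evaluation protocol behind the reported numbers (color space, resizing, averaging) is not specified here and may differ.

```python
import numpy as np


def psnr(pred: np.ndarray, target: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    if pred.shape != target.shape:
        raise ValueError("images must have the same shape")
    # Cast to float first so that uint8 subtraction cannot wrap around.
    diff = pred.astype(np.float64) - target.astype(np.float64)
    mse = float(np.mean(diff ** 2))
    if mse == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_val ** 2 / mse))


def mean_psnr(pairs) -> float:
    """Average per-image PSNR over an iterable of (prediction, target) pairs."""
    return float(np.mean([psnr(p, t) for p, t in pairs]))
```

A benchmark score would then be `mean_psnr(...)` over all enhanced/ground-truth pairs of a test set; higher is better.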
2. Deshadow
Deshadowing aims to eliminate shadows that are mainly caused by occlusion to obtain shadow-free document images.
2.1 Papers
* indicates that the implementation is unofficial.
2.2 Datasets
Dataset | Num. (train/test) | Type | Example | Download |
---|---|---|---|---|
RDD | 4916 (4371/545) | Real | Example | Link |
Kligler et al. | 300 | Real | Example | Link |
FSDSRD | 14200 | Synth | Example | Link |
Jung et al. | 87 | Real | Example | Link |
OSR | 237 | Real | Example | Link |
WEZUT OCR | 176 | Real | Example | Link |
SD7K | 7620 (6479/760) | Real | Example | Link |
SynDocDS | 50K (40K/5K) | Synth | - | Link |
2.3 SOTA
Venue | Method | Training data | Kligler et al. (300) RMSE↓ | Kligler et al. (300) PSNR↑ | Kligler et al. (300) SSIM↑ | Jung et al. (87) RMSE↓ | Jung et al. (87) PSNR↑ | Jung et al. (87) SSIM↑ | OSR (237) RMSE↓ | OSR (237) PSNR↑ | OSR (237) SSIM↑ | RDD (545) RMSE↓ | RDD (545) PSNR↑ | RDD (545) SSIM↑ | SD7K (760) RMSE↓ | SD7K (760) PSNR↑ | SD7K (760) SSIM↑ |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
CVPR'23 | BGShadowNet | RDD | 5.377 | 29.17 | 0.948 | - | - | - | - | - | - | 2.219 | 37.58 | 0.983 | - | - | - |
ICCV'23 | FSENet | SD7K | 10.60 | 28.98 | 0.93 | 17.56 | 23.60 | 0.85 | - | - | - | - | - | - | 10.00 | 28.67 | 0.96 |
3. Dewarping
Dewarping, also referred to as geometric rectification, aims to rectify document images that suffer from curves, folds, crumples, perspective/affine deformation and other geometric distortions.
3.1 Papers
3.2 Dataset
Dataset | Num. | Type | Example | Download/Codes |
---|---|---|---|---|
DocUNet | 130 | Real | Example | Link |
Doc3D | 100K | Synth | - | Link |
DIW | 5K | Real | Example | Link |
WarpDoc | 1020 | Real | Example | Link |
DIR300 | 300 | Real | Example | Link |
Inv3D | 25K | Synth | Example | Link |
DICP | - | Synth | - | Link |
DIF | - | Synth | - | Link |
Simulated Paper | 90K | Synth | - | Link |
DocReal | 200 | Real | Example | Link |
UVDoc | 20K | Synth | - | Link |
3.3 SOTA
Venue | Method | DocUNet (130) MS-SSIM↑ | DocUNet (130) LD↓ | DocUNet (130) AD↓ | DIR300 (300) MS-SSIM↑ | DIR300 (300) LD↓ | DIR300 (300) AD↓ | DocReal (200) MS-SSIM↑ | DocReal (200) LD↓ |
---|---|---|---|---|---|---|---|---|---|
ICCV'19 | DewarpNet | 0.474 | 8.39 | 0.426 | 0.492 | 13.94 | 0.331 | - | - |
DAS'20 | FCN-based | 0.448 | 7.84 | 0.434 | 0.503 | 9.75 | 0.331 | - | - |
ICCV'21 | Piece-Wise | 0.492 | 8.64 | 0.468 | - | - | - | - | - |
ICDAR'21 | DDCP | 0.473 | 8.99 | 0.453 | 0.552 | 10.95 | 0.357 | 0.46 | 16.04 |
MM'21 | DocTr | 0.511 | 7.76 | 0.396 | 0.616 | 7.21 | 0.254 | 0.55 | 12.66 |
CVPR'22 | RDGR | 0.497 | 8.51 | 0.461 | - | - | - | - | - |
MM'22 | Marior | 0.478 | 7.27 | 0.403 | - | - | - | - | - |
ECCV'22 | DocGeoNet | 0.504 | 7.71 | 0.380 | 0.638 | 6.40 | 0.242 | 0.55 | 12.22 |
SIGGRAPH'22 | PaperEdge | 0.473 | 7.81 | 0.392 | 0.583 | 8.00 | 0.255 | 0.52 | 11.46 |
arXiv'22 | DocScanner-L | 0.518 | 7.45 | 0.334 | - | - | - | - | - |
ICCV'23 | Li et al. | 0.497 | 8.43 | 0.376 | 0.607 | 7.68 | 0.244 | - | - |
WACV'23 | DocReal | 0.50 | 7.03 | - | - | - | - | 0.56 | 9.83 |
TCSVT'23 | DRNet | 0.51 | 7.42 | - | - | - | - | - | - |
TMM'23 | DocTr++ | 0.51 | 7.54 | - | - | - | - | 0.45 | 19.88 |
arXiv'23 | Polar-Doc | - | - | - | 0.605 | 7.17 | 0.206 | - | - |
arXiv'23 | MetaDoc | 0.502 | 7.42 | 0.315 | 0.638 | 5.75 | 0.178 | - | - |
SIGGRAPH'23 | UVDoc | 0.544 | 6.83 | 0.315 | - | - | - | - | - |
ACM TOG'23 | LA-DocFlatten | 0.526 | 6.72 | 0.300 | 0.651 | 5.70 | 0.195 | - | - |
Note that the 127th and 128th distorted images in the DocUNet benchmark are rotated by 180 degrees and therefore do not match their ground-truth documents. The performance reported here is based on the corrected data.
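The rotation fix mentioned in the note can be sketched as below. This is only an illustration under stated assumptions: images are NumPy arrays, `position` is the 1-based index of the distorted image in the benchmark's ordering, and the helper name `fix_rotation` is hypothetical. How the 127th and 128th images map to file names depends on how the benchmark files are sorted and is not specified here.

```python
import numpy as np

# 1-based positions of the distorted images that are rotated by 180 degrees,
# taken from the note above.
ROTATED_POSITIONS = {127, 128}


def fix_rotation(image: np.ndarray, position: int) -> np.ndarray:
    """Rotate the image by 180 degrees if its position is an affected one."""
    if position in ROTATED_POSITIONS:
        # Two 90-degree turns over the first two axes (height, width);
        # a trailing channel axis, if present, is left untouched.
        return np.rot90(image, k=2)
    return image
```

Applying the fix to the distorted input before dewarping (rather than rotating the ground truth) is a design choice made here for illustration; either way, the pair should be aligned before MS-SSIM, LD, or AD is computed.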
4. Deblur
4.1 Papers
4.2 Datasets
Dataset | Num. (train/test) | Type | Example | Download |
---|---|---|---|---|
TDD (text deblur dataset) | 67.6K (66K/1.6K) | Synth | Example | Link |
4.3 SOTA
Coming Soon ...
5. Binarization
5.1 Papers
5.2 Datasets
Dataset | Num. | Type | Example | Download |
---|---|---|---|---|
DocEng 2019 | 15 | Real | Example | Link |
DocEng 2020 | 32 | Real | Example | Link |
DocEng 2021 | 222 | Real | Example | Link |
DocEng 2022 | 80 | Real | Example | Link |
DIBCO 2009 | 10 | Real | Example | Link |
H-DIBCO 2010 | 10 | Real | Example | Link |
DIBCO 2011 | 16 | Real | Example | Link |
H-DIBCO 2012 | 14 | Real | Example | Link |
DIBCO 2013 | 16 | Real | Example | Link |
H-DIBCO 2014 | 10 | Real | Example | Link |
H-DIBCO 2016 | 10 | Real | Example | Link |
DIBCO 2017 | 20 | Real | Example | Link |
DIBCO 2018 | 10 | Real | Example | Link |
DIBCO 2019 | 10 | Real | Example | Link |
Bickley Diary | 7 | Real | Example | Link |
Synchromedia Multispectral (MSI) | 240 | Real | Example | Link |
Persian Heritage Image Binarization (PHIBD) | 15 | Real | Example | Link |
Palm Leaf | 50 | Real | Example | Link |
NoiseOffice | 216 | Synth | Example | Link |
LRDE Document Binarization Dataset | 125 | Real | - | Link |
Shipping label dataset | 1082 | Real | Example | Link |
5.3 SOTA
Coming Soon ...