
OCR, Scene-Text-Understanding, Text Recognition

Scene-Text-Understanding

Survey

  • [2015-PAMI] Text Detection and Recognition in Imagery: A Survey [paper]
  • [2014-Front.Comput.Sci] Scene Text Detection and Recognition: Recent Advances and Future Trends [paper]
  • [2020-arXiv] Text Recognition in the Wild: A Survey [paper]

Scene Text Detection

  • [2019-CVPR] Arbitrary Shape Scene Text Detection with Adaptive Text Region Representation [paper]
  • [2019-CVPR] A Multitask Network for Localization and Recognition of Text in Images (end-to-end) [paper]
  • [2019-CVPR] AFDM: Handwriting Recognition in Low-resource Scripts using Adversarial Learning (data augmentation) [paper] [code]
  • [2019-CVPR] CRAFT: Character Region Awareness for Text Detection [paper] [code]
  • [2019-CVPR] Data Extraction from Charts via Single Deep Neural Network(*) [paper]
  • [2019-CVPR] E2E-MLT: an Unconstrained End-to-End Method for Multi-Language Scene Text [paper]
  • [2019-arXiv] FACLSTM: ConvLSTM with Focused Attention for Scene Text Recognition [paper]
  • [2019-CVPR] Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes [paper]
  • [2019-CVPR] PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network [paper] [tensorflow] [pytorch]
  • [2019-CVPR] PMTD: Pyramid Mask Text Detector [paper] [code]
  • [2019-CVPR] Spatial Fusion GAN for Image Synthesis (word synthesis) [paper] https://arxiv.org/abs/1812.05840 [code]
  • [2019-CVPR] Scene Text Detection with Supervised Pyramid Context Network [paper][keras]
  • [2019-arXiv] TextField: Learning A Deep Direction Field for Irregular Scene Text Detection [paper] [code]
  • [2019-CVPR] Typography with Decor: Intelligent Text Style Transfer [paper] [code]
  • [2019-CVPR] TIoU: Tightness-aware Evaluation Protocol for Scene Text Detection (new evaluation tool) [paper] [code]
  • [2019-arXiv] MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition [paper] [code]
  • [2019-CVPR] Scene Text Magnifier [paper]
  • [2018-CVPR] Pixel-Anchor: A Fast Oriented Scene Text Detector with Combined Networks [paper]
  • [2018-ECCV] Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes [paper] [code]
  • [2018-AAAI] PixelLink: Detecting Scene Text via Instance Segmentation [paper] [code]
  • [2018-CVPR] RRPN: Arbitrary-Oriented Scene Text Detection via Rotation Proposals [paper] [code]
  • [2018-CVPR] Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation [Paper]
  • [2018-arxiv] PixelLink: Detecting Scene Text via Instance Segmentation [Paper]
  • [2018-AAAI] SEE: Towards Semi-Supervised End-to-End Scene Text Recognition [Paper]
  • [2018-arxiv] TextBoxes++: A Single-Shot Oriented Scene Text Detector [Paper]
  • [2017-arxiv] Attention-based Extraction of Structured [Paper]
  • [2017-ICCV] Single Shot Text Detector with Regional Attention [Paper]
  • [2017-ICCV] WordSup: Exploiting Word Annotations for Character based Text Detection [Paper]
  • [2017-arXiv] R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection [Paper]
  • [2017-CVPR] EAST: An Efficient and Accurate Scene Text Detector [Paper] [Code]
  • [2017-arXiv] Cascaded Segmentation-Detection Networks for Word-Level Text Spotting [Paper]
  • [2017-arXiv] Deep Direct Regression for Multi-Oriented Scene Text Detection [Paper]
  • [2017-CVPR] Detecting Oriented Text in Natural Images by Linking Segments [Paper]
  • [2017-CVPR] Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection [Paper]
  • [2017-arXiv] Arbitrary-Oriented Scene Text Detection via Rotation Proposals [Paper]
  • [2017-AAAI] TextBoxes: A Fast Text Detector with a Single Deep Neural Network [Paper] [Code]
  • [2016-arXiv] Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network [Paper]
  • [2016-arXiv] DeepText: A Unified Framework for Text Proposal Generation and Text Detection in Natural Images [Paper] [Data]
  • [2017-PR] TextProposals: a Text-specific Selective Search Algorithm for Word Spotting in the Wild [paper] [code]
  • [2016-arXiv] Scene Text Detection via Holistic, Multi-Channel Prediction [Paper]
  • [2016-CVPR] Canny Text Detector: Fast and Robust Scene Text Localization Algorithm [Paper]
  • [2016-CVPR] Synthetic Data for Text Localisation in Natural Images [Paper] [Data] [Code]
  • [2016-ECCV] Detecting Text in Natural Image with Connectionist Text Proposal Network [Paper] [Demo] [Code]
  • [2016-TIP] Text-Attentional Convolutional Neural Networks for Scene Text Detection [Paper]
  • [2016-IJDAR] TextCatcher: a method to detect curved and challenging text in natural scenes [Paper]
  • [2016-CVPR] Multi-Oriented Text Detection with Fully Convolutional Networks [Paper]
  • [2015-TPAMI] Real-time Lexicon-free Scene Text Localization and Recognition
  • [2015-CVPR] Symmetry-Based Text Line Detection in Natural Scenes
  • [2015-ICCV] FASText: Efficient unconstrained scene text detector [Paper] https://github.com/MichalBusta/FASText
  • [2015-D.Phil Thesis] Deep Learning for Text Spotting [Paper]
  • [2015-ICDAR] Object Proposals for Text Extraction in the Wild [Paper] https://github.com/lluisgomez/TextProposals
  • [2014-ECCV] Deep Features for Text Spotting [Paper] https://bitbucket.org/jaderberg/eccv2014_textspotting http://gitxiv.com/posts/uB4y7QdD5XquEJ69c/deep-features-for-text-spotting
  • [2014-TPAMI] Word Spotting and Recognition with Embedded Attributes [Paper] http://www.cvc.uab.es/~almazan/index/projects/words-att/index.html https://github.com/almazan/watts
  • [2014-TPAMI] Robust Text Detection in Natural Scene Images
  • [2014-ECCV] Robust Scene Text Detection with Convolution Neural Network Induced MSER Trees [Paper]
  • [2013-ICCV] Photo OCR: Reading Text in Uncontrolled Conditions [Paper]
  • [2012-CVPR] Real-time scene text localization and recognition [Paper]
  • [2010-CVPR] Detecting Text in Natural Scenes with Stroke Width Transform [Paper]

Scene Text Recognition

  • [2019-CVPR] ESIR: End-to-end Scene Text Recognition via Iterative Image Rectification [paper] [code] [code]
  • [2019-CVPR] E2E-MLT: an Unconstrained End-to-End Method for Multi-Language Scene Text [paper]
  • [2018-CVPR] FOTS: Fast [paper]
  • [2017-ICCV] WeText: Scene Text Detection under Weak Supervision [Paper]
  • [2017-ICCV] Single Shot Text Detector with Regional Attention [Paper] [Code]
  • [2017-ICCV] Self-organized Text Detection with Minimal Post-processing via Border Learning [Paper]
  • [2017-ICCV] Focusing Attention: Towards Accurate Text Recognition in Natural Images [Paper]
  • [2017-ICCV] Towards End-to-end Text Spotting with Convolutional Recurrent Neural Networks [Paper]
  • [2017-CVPR] Unambiguous Text Localization and Retrieval for Cluttered Scenes [Paper]
  • [2017-ICCV] WordSup: Exploiting Word Annotations for Character based Text Detection [Paper]
  • [2017-ICCV] Deep TextSpotter: An End-to-End Trainable Scene Text Localization and Recognition Framework [Paper] [Code]
  • [2017-arXiv] Cascaded Segmentation-Detection Networks for Word-Level Text Spotting [Paper]
  • [2017-AAAI] Detection and Recognition of Text Embedding in Online Images via Neural Context Models [Paper] [Code]
  • [2017-arXiv] Improving Text Proposal for Scene Images with Fully Convolutional Networks [Paper]
  • [2017-AAAI] TextBoxes: A Fast Text Detector with a Single Deep Neural Network [Paper] [Code] [GitHub code]
  • [2017-CVPR] Detecting Oriented Text in Natural Images by Linking Segments [Paper]
  • [2017-arXiv] Arbitrary-Oriented Scene Text Detection via Rotation Proposals [Paper]
  • [2017-CVPR] Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection [Paper]
  • [2016-arXiv] DeepText: A Unified Framework for Text Proposal Generation and Text Detection in Natural Images [Paper]
  • [2017-arXiv] Full-Page Text Recognition: Learning Where to Start and When to Stop https://arxiv.org/pdf/1704.08628.pdf
  • [2016-AAAI] Reading Scene Text in Deep Convolutional Sequences [Paper]
  • [2016-IJCV] Reading Text in the Wild with Convolutional Neural Networks [Paper] http://zeus.robots.ox.ac.uk/textsearch/#/search/ http://www.robots.ox.ac.uk/~vgg/research/text
  • [2016-CVPR] Recursive Recurrent Nets with Attention Modeling for OCR in the Wild [Paper]
  • [2016-CVPR] Robust Scene Text Recognition with Automatic Rectification [Paper]
  • [2016-NIPS] Generative Shape Models: Joint Text Recognition and Segmentation with Very Little Training Data [Paper]
  • [2015-CoRR] An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition [Paper] https://github.com/bgshih/crnn
  • [2015-ICDAR] Automatic Script Identification in the Wild [Paper]
  • [2015-ICLR] Deep structured output learning for unconstrained text recognition [Paper]
  • [2014-NIPS] Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition [Paper] http://www.robots.ox.ac.uk/~vgg/publications/2014/Jaderberg14c/ http://www.robots.ox.ac.uk/~vgg/research/text/model_release.tar.gz
  • [2014-TIP] A Unified Framework for Multi-Oriented Text Detection and Recognition
  • [2012-ICPR] End-to-End Text Recognition with Convolutional Neural Networks [Paper] http://cs.stanford.edu/people/twangcat/ICPR2012_code/SceneTextCNN_demo.tar http://ufldl.stanford.edu/housenumbers/

PhD Thesis

  • [2016-PhD Thesis] Context Modeling for Semantic Text Matching and Scene Text Detection [Paper]
  • [2015-PhD Thesis] Deep Learning for Text Spotting [Paper]
  • [2012-PhD Thesis] End-to-End Text Recognition with Convolutional Neural Networks [Paper]

Text Detection

  • [2018-arxiv] TextBoxes++: A Single-Shot Oriented Scene Text Detector [Paper]

Dataset

PowerPoint Text Detection and Recognition Dataset 2017

COCO-Text (Computer Vision Group, Cornell) 2016

  • 63,686 images, 173,589 text instances, 3 fine-grained text attributes.
  • Task: text localization and recognition

COCO-Text API

Synthetic Data for Text Localisation in Natural Images (VGG) 2016

  • 800 thousand images
  • 8 million synthetic word instances
  • download

Synthetic Word Dataset (Oxford, VGG) 2014

  • 9 million images covering 90k English words
  • Task: text recognition, segmentation
  • download

IIIT 5K-Words 2012

  • 5000 images from scene texts and born-digital images (2k training and 3k testing images)
  • Each image is a cropped word image of scene text with case-insensitive labels
  • Task: text recognition
  • download
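
Because the labels are case-insensitive, recognition results on a dataset like this are usually compared after normalizing case. The snippet below is a minimal sketch of such a word-accuracy check; the function name `word_accuracy` and the sample predictions are hypothetical and are not part of any official evaluation toolkit.

```python
def word_accuracy(predictions, labels):
    """Fraction of predicted words that equal their label, ignoring case."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    # casefold() is a more aggressive lower() intended for caseless matching
    correct = sum(p.casefold() == g.casefold() for p, g in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical recognizer output and ground truth for four cropped word images
preds = ["Hotel", "EXIT", "cafe", "st0p"]
truth = ["HOTEL", "exit", "Cafe", "STOP"]
print(word_accuracy(preds, truth))  # three of the four words match
```

Official benchmark protocols may apply further normalization (for example, stripping punctuation), so treat this only as an illustration of the case-insensitive comparison.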

StanfordSynth (Stanford, AI Group) 2012

  • Small single-character images of 62 characters (0-9, a-z, A-Z)
  • Task: text recognition
  • download
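
The 62-character set (0-9, a-z, A-Z) can be turned into class indices for a single-character classifier. The sketch below assumes the index order follows the order listed above (digits, then lowercase, then uppercase); the actual label encoding shipped with the dataset may differ, and the names `CHARSET`, `encode`, and `decode` are illustrative only.

```python
import string

# Assumed ordering: digits, lowercase letters, uppercase letters (62 classes)
CHARSET = string.digits + string.ascii_lowercase + string.ascii_uppercase
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(CHARSET)}


def encode(text):
    """Map each character to its class index; raises KeyError for unsupported characters."""
    return [CHAR_TO_INDEX[ch] for ch in text]


def decode(indices):
    """Map class indices back to a string."""
    return "".join(CHARSET[i] for i in indices)


print(len(CHARSET))
print(encode("a0Z"))
print(decode(encode("Text42")))
```

The same mapping applies to any dataset in this list that uses the 62-character alphabet.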

MSRA Text Detection 500 Database (MSRA-TD500) 2012

  • 500 natural images (resolutions of the images vary from 1296x864 to 1920x1280)
  • Chinese, English, or a mixture of both
  • Task: text detection

Street View Text (SVT) 2010

  • 350 high resolution images (average size 1260 × 860) (100 images for training and 250 images for testing)
  • Only word level bounding boxes are provided with case-insensitive labels
  • Task: text localization

KAIST Scene_Text Database 2010

  • 3000 images of indoor and outdoor scenes containing text
  • Korean, English (Number), and Mixed (Korean + English + Number)
  • Task: text localization, segmentation and recognition

Chars74k 2009

  • Over 74K images from natural images, as well as a set of synthetically generated characters

  • Small single-character images of 62 characters (0-9, a-z, A-Z)

  • Task: text recognition

ICDAR Benchmark Datasets

| Dataset | Description | Competition Paper |
|---|---|---|
| ICDAR 2017 | 42618 training images and 9837 testing images | paper link |
| ICDAR 2015 | 1000 training images and 500 testing images | paper link |
| ICDAR 2013 | 229 training images and 233 testing images | paper link |
| ICDAR 2011 | 229 training images and 255 testing images | paper link |
| ICDAR 2005 | 1001 training images and 489 testing images | paper link |
| ICDAR 2003 | 181 training images and 251 testing images (word level and character level) | paper link |

Blogs

Online Service

| Name | Description |
|---|---|
| Online OCR | API, free |
| Free OCR | API, free |
| New OCR | API, free |
| ABBYY FineReader Online | non-API, free |

Open Resources Code

  • Chinese natural-scene text detection and recognition based on YOLOv3 and CRNN [code]
  • Ultra-lightweight Chinese OCR with support for vertical text recognition and ncnn inference; PSENet (8.5M) + CRNN (6.3M) + AngleNet (1.5M), total model size only 17M [code]
  • Tesseract: C++-based tools for document analysis and OCR [code]
  • Ocropy: Python-based tools for document analysis and OCR https://github.com/tmbdev/ocropy
  • CLSTM: a small implementation of LSTM networks, focused on OCR https://github.com/tmbdev/clstm
  • Convolutional Recurrent Neural Network (Torch7) https://github.com/bgshih/crnn
  • Attention-OCR: visual attention based OCR https://github.com/da03/Attention-OCR
  • Umaru: an OCR system based on Torch using LSTM/GRU-RNN and CTC, with reference to the work of rnnlib and clstm https://github.com/edward-zhu/umaru
  • AKSHAYUBHAT/DeepVideoAnalytics (CTPN+CRNN) [code]
  • ankush-me/SynthText [code]
  • JarveeLee/SynthText_Chinese_version [code]

Handwriting Recognition

  • [2016-arXiv] Drawing and Recognizing Chinese Characters with Recurrent Neural Network https://arxiv.org/abs/1606.06539
  • Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition https://arxiv.org/abs/1610.02616
  • Stroke Sequence-Dependent Deep Convolutional Neural Network for Online Handwritten Chinese Character Recognition https://arxiv.org/abs/1610.04057
  • High Performance Offline Handwritten Chinese Character Recognition Using GoogLeNet and Directional Feature Maps http://arxiv.org/abs/1505.04925
  • DeepHCCR: Offline Handwritten Chinese Character Recognition based on GoogLeNet and AlexNet (With CaffeModel) https://github.com/chongyangtao/DeepHCCR
  • Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention http://arxiv.org/abs/1604.03286
  • MLPaint: the Real-Time Handwritten Digit Recognizer http://blog.mldb.ai/blog/posts/2016/09/mlpaint/
  • caffe-ocr: OCR with caffe deep learning framework https://github.com/pannous/caffe-ocr

License Plate Recognition

  • Reading Car License Plates Using Deep Convolutional Neural Networks and LSTMs
  • Number plate recognition with TensorFlow http://matthewearl.github.io/2016/05/06/cnn-anpr/
  • end-to-end-for-plate-recognition https://github.com/szad670401/end-to-end-for-chinese-plate-recognition
  • http://rnd.azoft.com/applying-ocr-technology-receipt-recognition/