You should carefully read the SVTP images.

Open delveintodetail opened this issue 4 years ago • 3 comments

SVTP has 645 images. If you carefully read each one, you will find around 20 images that humans cannot recognize. How can this algorithm reach 98.6%?

When you see such a surprising improvement over the SOTA, you should carefully check your code.
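
The arithmetic behind this concern can be sketched as below. This is only a rough sanity check, not the repository's evaluation code: the function names are hypothetical, the count of 20 unreadable images is the estimate given above, and the ceiling assumes a model cannot get an image right when a human cannot read it (a model could still guess some of them correctly).

```python
def accuracy_ceiling(total_images: int, unreadable_images: int) -> float:
    """Best possible word accuracy (percent) if every unreadable image is missed."""
    if total_images <= 0 or not 0 <= unreadable_images <= total_images:
        raise ValueError("counts must satisfy 0 <= unreadable_images <= total_images, total_images > 0")
    return 100.0 * (total_images - unreadable_images) / total_images


def implied_errors(total_images: int, accuracy_percent: float) -> int:
    """Number of misrecognized images implied by a reported accuracy."""
    return round(total_images * (1.0 - accuracy_percent / 100.0))


if __name__ == "__main__":
    total = 645        # SVTP test set size, as stated in this issue
    unreadable = 20    # rough estimate from this issue, not a measured value
    reported = 98.6    # accuracy questioned in this issue

    print(f"ceiling with {unreadable} unreadable images: "
          f"{accuracy_ceiling(total, unreadable):.1f}%")
    print(f"errors implied by {reported}%: {implied_errors(total, reported)}")
```

Under these assumptions the ceiling is about 96.9%, while 98.6% would mean only about 9 errors in total, fewer than the estimated number of unreadable images.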

delveintodetail avatar Mar 23 '20 02:03 delveintodetail

Thank you for the reminder. Maybe there is something wrong with my code. I'll check it carefully when I can. I am busy with work at the moment and short on time, so there may be a delay. Sorry.

fengxinjie avatar Mar 23 '20 02:03 fengxinjie

There are several papers that apply the Transformer to OCR:

  1. NRTR: A no-recurrence sequence-to-sequence model for scene text recognition
  2. MASTER: Multi-Aspect Non-local Network for Scene Text Recognition
  3. A Simple and Strong Convolutional-Attention Network for Scene Text Recognition

delveintodetail avatar Mar 23 '20 02:03 delveintodetail

Thanks very much, I will read them later.

fengxinjie avatar Mar 23 '20 02:03 fengxinjie