doctr
Classify blocks as handwritten or printed text
🚀 The feature
It would be good to have an internal classifier that can distinguish handwritten blocks from printed ones. This is useful because most of the documents I have seen contain both, and it is sometimes important that handwritten data be treated differently from printed text.
Motivation, pitch
I am working on extracting text from medical documents in which some parts, such as hospital and doctor details, are printed, while medicine names and dosages are handwritten. I would like to be able to treat them differently while running text extraction.
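To make the request concrete, here is a rough sketch of the kind of routing I have in mind. Everything in it is hypothetical: `Block`, `route_blocks`, the `classify` callable and the `recognizers` mapping are names made up for illustration, not part of doctr's API. It only assumes that text blocks have already been detected and cropped.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Block:
    """A detected text block (hypothetical type, not a doctr class)."""
    box: Tuple[int, int, int, int]  # (xmin, ymin, xmax, ymax) in pixels
    crop: Any                       # image crop of the block, e.g. a NumPy array


def route_blocks(
    blocks: List[Block],
    classify: Callable[[Any], str],
    recognizers: Dict[str, Callable[[Any], str]],
) -> List[Tuple[Block, str, str]]:
    """Label each block, then run the recognizer registered for that label.

    `classify` is assumed to return a key of `recognizers`,
    for example "handwritten" or "printed".
    """
    results = []
    for block in blocks:
        kind = classify(block.crop)             # "handwritten" or "printed"
        text = recognizers[kind](block.crop)    # type-specific text recognition
        results.append((block, kind, text))
    return results
```

With something like this, the printed header (hospital, doctor) and the handwritten prescription lines could go through different recognition models, or the handwritten ones could simply be flagged for manual review.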
Alternatives
No response
Additional context
https://github.com/awslabs/handwritten-text-recognition-for-apache-mxnet
We are working internally on handwritten text recognition. It could indeed be useful to also work on a classifier that determines the type of text, sitting upstream of the text recognition model.
Did anyone make any progress on this feature?
@charlesmindee we're keenly interested in the handwritten text recognition feature as well. Has there been any progress or can you provide an estimated timeline?