tesseract topic
PyMuPDF
PyMuPDF is a high-performance Python library for data extraction, analysis, conversion, and manipulation of PDF (and other) documents.
dlp-pdf-redaction
This solution provides an automated, serverless way to redact sensitive data from PDF files using Google Cloud services such as Data Loss Prevention (DLP), Cloud Workflows, and Cloud Run.
wordscapes-bot
Python bot that plays Wordscapes via scrcpy and pyautogui
wagtail_textract
Text extraction for Wagtail document search
CopyTextFromVideo
Multithreaded script to copy text from a video file or camera, using OpenCV for image processing and Tesseract OCR for text recognition. C++ video text recognition.
PyraDox
PyraDox is a Python tool that helps with document digitization by extracting text and masking personal information using Tesseract OCR.
tesseract-OCR-iOS-demo
This prototype recognizes text inside an image using Tesseract OCR. The underlying Tesseract engine processes the picture and returns anything that it believes is text.