textract topic

List textract repositories

doc2audiobook

203 Stars, 32 Forks

Convert text documents to high-fidelity audio(books).

code4goal-resume-parser

126 Stars, 65 Forks

Solution for Code4Goal challenge

aws-pdf-textract-pipeline

159 Stars, 19 Forks

Data pipeline for crawling PDFs from the Web and transforming their contents into structured data using AWS Textract. Built with AWS CDK + TypeScript.

s3-ocr

111 Stars, 6 Forks

Tools for running OCR against files stored in S3

Textractor

49 Stars, 9 Forks

An efficient class library for extracting body text from HTML.

wagtail_textract

31 Stars, 13 Forks

Text extraction for Wagtail document search

ocr-python

63 Stars, 9 Forks

OCR library to extract text & tables from PDF files and images. Convert any image or PDF to CSV / TXT / JSON / Searchable PDF.