document-analysis topic

PdfPig

1.5k stars, 220 forks

Read and extract text and other content from PDFs in C# (port of PDFBox)

awesome-document-understanding

1.2k stars, 133 forks

A curated list of resources on Document Understanding (DU)

Curve-Text-Detector

633 stars, 155 forks

This repository provides training and testing code, a dataset, detection and recognition annotations, an evaluation script, an annotation tool, and a ranking.

PICK-pytorch

546 stars, 190 forks

Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)

pandora

240 stars, 36 forks

Pandora is an analysis framework that checks whether a file is suspicious and conveniently displays the results

docExtractor

82 stars, 9 forks

Code for the paper "docExtractor: An off-the-shelf historical document element extraction" (ICFHR 2020 oral)

LiLT

326 stars, 39 forks

Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)