layout-analysis topic

List layout-analysis repositories

PdfPig

1.5k stars, 220 forks

Read and extract text and other content from PDFs in C# (port of PDFBox)

DocumentLayoutAnalysis

530 stars, 59 forks

Document Layout Analysis resources for development with PdfPig.

layout-parser

4.6k stars, 442 forks

A Unified Toolkit for Deep Learning Based Document Image Analysis

kraken

661 stars, 122 forks

OCR engine for all the languages

PdfPigMLNetBlockClassifier

22 stars, 6 forks

Proof of concept of training a simple region classifier using PdfPig and ML.NET (LightGBM). The objective is to classify each text block in a PDF document page as either title, text, list, table and...

PDFSegmenter

19 stars, 3 forks

This library builds a graph representation of the content of PDFs. The graph is then clustered, and the resulting page segments are classified and returned. Tables are retrieved formatted as CSV.

detectron2-publaynet

46 stars, 6 forks

Trained Detectron2 object detection models for document layout analysis based on PubLayNet dataset

publaynet-models

21 stars, 1 fork

Trained Detectron2 object detection models for document layout analysis based on PubLayNet dataset

SelfDocSeg

30 stars, 2 forks

[ICDAR 2023] SelfDocSeg: A self-supervised vision-based approach towards Document Segmentation (Oral)

HJDataset

28 stars, 4 forks

A Large Dataset of Historical Japanese Documents with Complex Layouts