alto-xml topic
PdfPig
Read and extract text and other content from PDFs in C# (port of PDFBox)
DocumentLayoutAnalysis
Document layout analysis resources and repositories for development with PdfPig.
kraken
OCR engine for all the languages
kitodo-presentation
Kitodo.Presentation is a feature-rich framework for building a METS- or IIIF-based digital library. It is part of the Kitodo Digital Library Suite.
EN-data_mining
Data Mining Historical Newspaper Metadata (METS/ALTO formats)
Image_Retrieval
Image Retrieval in Digital Libraries - A Multicollection Experiment with Machine Learning Techniques
mirador-textoverlay
Text Overlay plugin for Mirador 3
alto-tools
Python tools for performing various operations on ALTO XML files
ocr-conversion
Conversions between various OCR formats