document-analysis topic
AdverseBiNet
Improving Document Binarization via Adversarial Noise-Texture Augmentation (ICIP 2019)
Document_Layout_Analysis-MonkAI
Deep-learning models that take a document image file as input and locate paragraphs, lines, images, etc., returning their positions with labels and confidence scores.
assemblyline
AssemblyLine 4: File triage and malware analysis
ViBERTgrid-PyTorch
An unofficial PyTorch implementation of "Lin et al. ViBERTgrid: A Jointly Trained Multi-Modal 2D Document Representation for Key Information Extraction from Documents. ICDAR, 2021"
AdvancedLiterateMachinery
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
UTRNet-High-Resolution-Urdu-Text-Recognition
UTRNet: High-Resolution Urdu Text Recognition In Printed Documents (ICDAR'23)
Retrieval-Augmented-Generation-Engine-with-LangChain-and-Streamlit
Powerful web application that combines Streamlit, LangChain, and Pinecone to simplify document analysis. Powered by OpenAI's GPT-3, RAG enables dynamic, interactive document conversations, making it i...
detectron2-publaynet
Trained Detectron2 object detection models for document layout analysis, based on the PubLayNet dataset
amazon-textract-transformer-pipeline
Post-process Amazon Textract results with Hugging Face transformer models for document understanding
docvisor
An open-source tool for visualisation of outputs of deep-learning models for document analysis tasks such as fully automatic, bounding box and OCR.