sparrow-donut
Data extraction with Donut ML model
Sparrow Donut
Data extraction with ML
The Principle
Sparrow is an open-source solution for extracting and processing data from documents and images. It handles forms, invoices, receipts, and other unstructured data sources. Its architecture is modular: OCR, Donut fine-tuning/inference, and a data labeling UI each run as an independent service.
Services
- sparrow-data - This service focuses on data preparation specifically for the Donut ML model, including fine-tuning and OCR integration.
- sparrow-ml - Dedicated to the Donut ML model, this service handles both fine-tuning and inference, streamlining the machine learning workflow.
- sparrow-ui - A user interface for Donut ML model data labeling, together with a dashboard.
Installation
Donut
Follow the install steps outlined here:
- Donut Data install steps
- Donut ML install steps
- Donut UI install steps
Usage
Donut
Follow the steps outlined here:
- Donut Data usage steps
- Donut ML usage steps
- Donut UI usage steps
Examples
Inference with Donut ML model
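To give a rough idea of what inference output can look like: Donut-style models typically emit a tagged sequence such as `<s_total>4.50</s_total>`, which is then converted into structured fields. The sketch below is a simplified, standalone illustration of that conversion and is not Sparrow's own code. The function name `parse_donut_sequence` and the sample field names (`menu`, `nm`, `price`, `total`) are assumptions made for this example, and list handling with separator tokens is left out.

```python
import re

# Hypothetical helper for illustration only; not part of Sparrow or Donut.
# Matches one <s_key>...</s_key> pair; the backreference \1 ties the closing
# tag to the opening key.
_FIELD = re.compile(r"<s_(\w+)>(.*?)</s_\1>", re.DOTALL)


def parse_donut_sequence(seq: str) -> dict:
    """Convert a Donut-style tagged string into a nested dict (simplified).

    Assumptions: keys are not nested inside a field of the same name, and
    repeated keys at the same level overwrite each other.
    """
    result = {}
    for match in _FIELD.finditer(seq):
        key, inner = match.group(1), match.group(2)
        if "<s_" in inner:
            # The field contains nested fields, so parse it recursively.
            result[key] = parse_donut_sequence(inner)
        else:
            result[key] = inner.strip()
    return result


if __name__ == "__main__":
    # Made-up sample sequence, not real model output.
    sample = (
        "<s_menu><s_nm>Latte</s_nm><s_price>4.50</s_price></s_menu>"
        "<s_total>4.50</s_total>"
    )
    print(parse_donut_sequence(sample))
    # {'menu': {'nm': 'Latte', 'price': '4.50'}, 'total': '4.50'}
```

In practice the Donut ML usage steps describe the actual inference workflow; this sketch only shows how tagged output maps to key-value fields.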
Sparrow UI:
Author
License
Licensed under the Apache License, Version 2.0. Copyright 2020-2024 Katana ML, Andrej Baranovskij. Copy of the license.