python-documentai-toolbox
Document AI Toolbox is an SDK for Python that provides utility functions for managing, manipulating, and extracting information from the document response. It creates a "wrapped" document object from...
Explore switching PDF Splitter from PikePDF to PyMuPDF. See if efficiency/code readability improves: https://pymupdf.readthedocs.io/en/latest/about.html
- To simplify maintenance/testing, it could be beneficial to use a File Tree Library for the GCS Utilities. Examples: https://github.com/ddddddO/gtree https://github.com/owlinux1000/gcstree
- The current default behavior makes post-processing difficult when the OCR doesn't group the table rows correctly.
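As a rough illustration of the file tree idea for the GCS Utilities mentioned above, the sketch below builds a nested tree from a flat list of blob names and renders it with indentation. It uses only the standard library rather than either of the linked projects; the function names `build_tree` and `render_tree` and the sample blob names are hypothetical, not part of the toolbox.

```python
def build_tree(blob_names):
    """Turn flat, slash-separated blob names into a nested dict."""
    tree = {}
    for name in blob_names:
        node = tree
        for part in name.split("/"):
            # Each path segment becomes a key; leaves are empty dicts.
            node = node.setdefault(part, {})
    return tree


def render_tree(tree, indent=0):
    """Return the tree as a list of lines, two spaces per nesting level."""
    lines = []
    for key in sorted(tree):
        lines.append("  " * indent + key)
        lines.extend(render_tree(tree[key], indent + 1))
    return lines


# Hypothetical blob names, as a GCS listing might return them.
sample_blobs = [
    "output/0/doc-0.json",
    "output/0/doc-1.json",
    "output/1/doc-0.json",
]
tree_lines = render_tree(build_tree(sample_blobs))
print("\n".join(tree_lines))
```

Keeping the tree as plain nested dicts makes it easy to assert on in unit tests without touching a real bucket.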
Transcript of conversation with Rand Wrobel. Rand Wrobel, Jan 9, 7:36 PM Is there any index into the functions provided in the DocAI Toolbox in Git? You, Jan 9, 8:26 ...
Inspired by https://stackoverflow.com/a/77609221/6216983. The [`from_gcs()`][2] method can only create a single Wrapped Document from a single document output in GCS. It could be simpler for users if this method could output...
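One way a caller might work around the single-document limit today is to group the output blobs by their parent folder and handle each folder separately. The sketch below shows only that grouping step in plain Python. It assumes each document's output JSON files share one GCS prefix; the helper name `group_blobs_by_prefix` and the blob names are hypothetical, and no toolbox or GCS call is made.

```python
from collections import defaultdict
from posixpath import dirname


def group_blobs_by_prefix(blob_names):
    """Group JSON blob names by their parent folder (GCS prefix)."""
    groups = defaultdict(list)
    for name in blob_names:
        # Assumption: only .json blobs are document output shards.
        if name.endswith(".json"):
            groups[dirname(name)].append(name)
    return {prefix: sorted(names) for prefix, names in groups.items()}


# Hypothetical listing of a batch output location.
listed_blobs = [
    "output/123/0/doc-0.json",
    "output/123/0/doc-1.json",
    "output/123/1/other-0.json",
    "output/123/1/readme.txt",
]
groups = group_blobs_by_prefix(listed_blobs)
for prefix in sorted(groups):
    # Each prefix could then be wrapped as its own document.
    print(prefix, len(groups[prefix]))
```

Each resulting prefix corresponds to one document's output, so a caller could loop over the keys and create one wrapped document per prefix.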
Hi, I have found a backward incompatibility related to the change included in the 2a0 version. When using the `export_hocr_str()` method in wrapper.document.py, the hOCR output format is different from previous...
[Policy Bot](https://github.com/googleapis/repo-automation-bots/tree/main/packages/policy#policy-bot) found one or more issues with this repository. - [x] Default branch is 'main' - [ ] Branch protection is enabled - [x] Merge commits disabled - [x]...
This PR contains the following updates: | Package | Change | Age | Adoption | Passing | Confidence | |---|---|---|---|---|---| | [argcomplete](https://togithub.com/kislyuk/argcomplete) ([changelog](https://togithub.com/kislyuk/argcomplete/blob/master/Changes.rst)) | `==3.4.0` -> `==3.5.0` |...
Here is an example of the entities returned from the splitter: ``` [text_anchor { text_segments { end_index: 1424 } } type_: "form1" confidence: 0.96 page_anchor { page_refs { } page_refs { page: 1 }...
This PR contains the following updates: | Package | Change | Age | Adoption | Passing | Confidence | |---|---|---|---|---|---| | [argcomplete](https://redirect.github.com/kislyuk/argcomplete) ([changelog](https://redirect.github.com/kislyuk/argcomplete/blob/master/Changes.rst)) | `==3.4.0` -> `==3.5.0` | [](https://docs.renovatebot.com/merge-confidence/) |...