py-pdf-parser
A Python tool to help extract information from structured PDFs.
Bumps [wand](https://github.com/emcconville/wand) from 0.6.9 to 0.6.10. Release notes Sourced from wand's releases. Wand 0.6.10 The 0.6.10 release is an immediate patch release to address additional segmentation faults, and Apple M1...
Bumps [ddt](https://github.com/datadriventests/ddt) from 1.5.0 to 1.6.0. Release notes Sourced from ddt's releases. 1.6.0 What's Changed Moved @named_data into main ddt.py module so it can be imported. by @orgadish in datadriventests/ddt#109...
Bumps [matplotlib](https://github.com/matplotlib/matplotlib) from 3.5.1 to 3.5.3. Release notes Sourced from matplotlib's releases. REL: v3.5.3 This is the third bugfix release of the 3.5.x series. This release contains several bug-fixes and...
**Bug Report** `extract_table` re-orders the table rows by the `y` axis (top to bottom), which works for most cases. The issue comes if we have a table with a header...
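To make the ordering described in this report concrete, here is a minimal sketch of sorting table rows top to bottom by their `y` coordinate. It is not the library's implementation: the `Cell` class and `order_rows_top_to_bottom` function are hypothetical names introduced for illustration, and the sketch assumes PDF-style coordinates where `y` grows upwards.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cell:
    """Hypothetical stand-in for a PDF element; not a py-pdf-parser class."""

    text: str
    x0: float  # left edge
    y1: float  # top edge, assuming PDF coordinates where y grows upwards


def order_rows_top_to_bottom(rows: List[List[Cell]]) -> List[List[Cell]]:
    """Sort rows so that the row highest on the page comes first."""
    # A larger y means higher on the page, so sort in descending order.
    return sorted(rows, key=lambda row: max(c.y1 for c in row), reverse=True)


# Rows supplied in an arbitrary order.
rows = [
    [Cell("b1", 10, 500), Cell("b2", 100, 500)],
    [Cell("a1", 10, 700), Cell("a2", 100, 700)],
    [Cell("c1", 10, 300), Cell("c2", 100, 300)],
]
ordered = order_rows_top_to_bottom(rows)
print([[c.text for c in row] for row in ordered])
```

Any order the caller supplied is discarded by a sort like this, which is the behaviour the report calls out.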
https://github.com/jstockwin/py-pdf-parser/pull/218 adds a single proof of concept test for the visualise tool. We should test it more thoroughly.
#89 should have had an example added to the documentation. We already have an example that runs through a variety of tables, including ones which go over the page. We...
We should add some code coverage checks to the CI. Initially this will help us find untested areas (note the visualise tool is currently untested as we're not really sure...
You can pass `show_info=True` to the visualise tool, which lets you click on elements and see their details. It is unfinished and needs work. - The visuals need...
We [currently](https://github.com/jstockwin/py-pdf-parser/blob/master/py_pdf_parser/components.py#L173) call `.height`. However, from https://github.com/pdfminer/pdfminer.six/issues/202 it looks as though `LTChar` has a `size` attribute. We should use this instead. That said, we should check the PDFMiner code and...
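One possible shape for this change is sketched below, under the assumption that character objects expose a `size` attribute as the linked pdfminer.six issue suggests. The helper `char_font_size` is a hypothetical name, not part of py-pdf-parser, and the `SimpleNamespace` objects merely stand in for layout characters so that the example runs without a PDF.

```python
from types import SimpleNamespace


def char_font_size(char) -> float:
    """Return the font size of a character-like object.

    Prefers a `size` attribute and falls back to `height` when it is missing.
    """
    size = getattr(char, "size", None)
    return float(size if size is not None else char.height)


# Hypothetical stand-ins for layout characters.
with_size = SimpleNamespace(size=12.0, height=13.4)
without_size = SimpleNamespace(height=9.5)

print(char_font_size(with_size))
print(char_font_size(without_size))
```

Keeping `height` as a fallback means the helper still returns a value for objects that lack `size`; whether that fallback is needed depends on the check of the PDFMiner code mentioned above.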
Bumps [shapely](https://github.com/shapely/shapely) from 1.8.2 to 1.8.5.post1. Release notes Sourced from shapely's releases. 1.8.5.post1 No release notes provided. 1.8.5 Packaging Python 3.11 wheels have been added to the matrix for all...