py-pdf-parser

A Python tool to help extract information from structured PDFs.

Results 31 py-pdf-parser issues

Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 3.1.1 to 3.2.0. Release notes Sourced from docker/build-push-action's releases. v3.2.0 What's Changed Remove workaround for setOutput by @​crazy-max (#704) Docs: fix Git context link and add more...

dependencies
github_actions

updates:
- [github.com/psf/black: 22.8.0 → 22.10.0](https://github.com/psf/black/compare/22.8.0...22.10.0)
- [github.com/pre-commit/mirrors-mypy: v0.981 → v0.982](https://github.com/pre-commit/mirrors-mypy/compare/v0.981...v0.982)

Bumps [matplotlib](https://github.com/matplotlib/matplotlib) from 3.5.1 to 3.6.1. Release notes Sourced from matplotlib's releases. REL: v3.6.1 This is the first bugfix release of the 3.6.x series. This release contains several bug-fixes and...

dependencies
python

updates:
- [github.com/pre-commit/pre-commit-hooks: v4.4.0 → v4.6.0](https://github.com/pre-commit/pre-commit-hooks/compare/v4.4.0...v4.6.0)
- [github.com/pycqa/isort: 5.12.0 → 5.13.2](https://github.com/pycqa/isort/compare/5.12.0...5.13.2)
- [github.com/psf/black: 23.7.0 → 24.4.2](https://github.com/psf/black/compare/23.7.0...24.4.2)
- [github.com/pycqa/flake8: 6.1.0 → 7.1.0](https://github.com/pycqa/flake8/compare/6.1.0...7.1.0)
- [github.com/pre-commit/mirrors-mypy: v1.4.1 → v1.10.1](https://github.com/pre-commit/mirrors-mypy/compare/v1.4.1...v1.10.1)

pyvoronoi is pinned to 1.0.7, which is incompatible with Python 3.11 and upwards (see https://github.com/fabanc/pyvoronoi/issues/22#issuecomment-1809156090). My suggestion is to upgrade the dependencies or include a maximum Python version of 3.10.x in...

enhancement

The filters will now be called `filter_by_text_contains()` and `filter_by_text_equals()` with the added "s".

At the moment, one filter is called `filter_by_text_contains()` while the other is called `filter_by_text_equal()`. Notably, these are similar in structure except for the final verb being in different forms....

enhancement

**Description** Added the ability to ignore case when filtering PDF elements by text matching. The default is not to ignore case, so behaviour from previous versions is unchanged. **Linked issues** closes...

**Feature Request** When using `filter_by_text_contains` or `filter_by_text_equals`, it would be nice to have an `ignore_case` parameter that allows caseless matching. For example, currently filtering by...

enhancement
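
To illustrate what caseless matching could look like, here is a minimal standalone sketch in plain Python. It does not use py-pdf-parser itself: the helpers `text_equals` and `text_contains` and the sample strings are hypothetical, and only the `ignore_case` parameter name comes from the feature request above. The sketch assumes `str.casefold()` is an acceptable way to compare text without regard to case.

```python
def text_equals(element_text: str, query: str, ignore_case: bool = False) -> bool:
    """Hypothetical helper: exact text match, optionally ignoring case."""
    if ignore_case:
        # casefold() is a more aggressive lower() intended for caseless comparison
        return element_text.casefold() == query.casefold()
    return element_text == query


def text_contains(element_text: str, query: str, ignore_case: bool = False) -> bool:
    """Hypothetical helper: substring match, optionally ignoring case."""
    if ignore_case:
        return query.casefold() in element_text.casefold()
    return query in element_text


# Made-up element texts standing in for text extracted from a PDF.
texts = ["Invoice Number", "invoice number", "Total"]

# With ignore_case=False (the default) only the exact-case entry matches.
strict_matches = [t for t in texts if text_contains(t, "invoice")]

# With ignore_case=True both spellings match.
caseless_matches = [t for t in texts if text_contains(t, "invoice", ignore_case=True)]
```

Keeping `ignore_case=False` as the default means existing calls behave exactly as before, which matches the backwards-compatibility goal stated in the related pull request description.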

I have a file containing several pages - see [bugfile.pdf](https://github.com/jstockwin/py-pdf-parser/files/11454761/bugfile.pdf) I now want to extract pages 2 and 3 of this file and add them into a new `PDFDocument` object....

bug
component: loaders