py-pdf-parser
Add custom filter predicate and header/footer filters
Feature Request - methods to add to ElementList
# Filter elements based on a custom filter passed into the method
ElementList.filter(predicate: Callable[[PDFElement], bool]) -> ElementList
# Filter elements based on font size alone, ignoring differences in font
ElementList.filter_by_font_size(font_size: float) -> ElementList
# Filter out elements that are contained within the page header
ElementList.filter_out_header(bottom_of_header_y: float) -> ElementList
# Filter out elements that are contained within the page footer
ElementList.filter_out_footer(top_of_footer_y: float) -> ElementList
# Return the first element in the ElementList
ElementList.first() -> PDFElement
# Return the last element in the ElementList
ElementList.last() -> PDFElement
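As a rough sketch of the behaviour I have in mind (not the library's actual implementation): `Element` and `Elements` below are hypothetical stand-ins for `PDFElement` and `ElementList`, and the `y0`/`y1` fields are assumed to be the bottom and top edges of an element in PDF coordinates, where y increases up the page. "Contained within" is read strictly, so an element that straddles the header or footer boundary is kept.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass(frozen=True)
class Element:
    """Hypothetical stand-in for PDFElement."""
    text: str
    font_size: float
    y0: float  # bottom edge (assumed: y grows upwards from the page bottom)
    y1: float  # top edge


class Elements:
    """Hypothetical stand-in for ElementList; each filter returns a new list."""

    def __init__(self, items: Iterable[Element]) -> None:
        self._items: List[Element] = list(items)

    def __iter__(self) -> Iterator[Element]:
        return iter(self._items)

    def __len__(self) -> int:
        return len(self._items)

    def filter(self, predicate: Callable[[Element], bool]) -> "Elements":
        # Keep only the elements for which the predicate returns True.
        return Elements(e for e in self._items if predicate(e))

    def filter_by_font_size(self, font_size: float) -> "Elements":
        # Exact comparison; a tolerance may be preferable in practice.
        return self.filter(lambda e: e.font_size == font_size)

    def filter_out_header(self, bottom_of_header_y: float) -> "Elements":
        # Drop elements lying entirely at or above the header's bottom edge.
        return self.filter(lambda e: e.y0 < bottom_of_header_y)

    def filter_out_footer(self, top_of_footer_y: float) -> "Elements":
        # Drop elements lying entirely at or below the footer's top edge.
        return self.filter(lambda e: e.y1 > top_of_footer_y)

    def first(self) -> Element:
        if not self._items:
            raise IndexError("first() called on an empty list")
        return self._items[0]

    def last(self) -> Element:
        if not self._items:
            raise IndexError("last() called on an empty list")
        return self._items[-1]


# Usage with made-up coordinates for a single page.
page = Elements([
    Element("Report title", 8.0, 800.0, 812.0),  # inside the header
    Element("Section 1", 14.0, 700.0, 716.0),
    Element("Body text", 10.0, 400.0, 690.0),
    Element("Page 1", 8.0, 20.0, 30.0),          # inside the footer
])
body = page.filter_out_header(780.0).filter_out_footer(50.0)
print([e.text for e in body], body.first().text, body.last().text)
```

Raising `IndexError` from `first()` and `last()` on an empty list is only one option; returning `None` would also work, and I don't have a strong preference.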
I'd be happy to implement these myself.