Add custom filter predicate and header/footer filters

Open aiden2480 opened this issue 2 months ago • 0 comments

Feature Request - methods to add to ElementList

# Filter elements based on a custom filter passed into the method
ElementList.filter(predicate: Callable[[PDFElement], bool]) -> ElementList

# Filter elements based on font size alone, ignoring differences in font
ElementList.filter_by_font_size(font_size: float) -> ElementList

# Filter out elements that are contained within the page header
ElementList.filter_out_header(bottom_of_header_y: float) -> ElementList

# Filter out elements that are contained within the page footer
ElementList.filter_out_footer(top_of_footer_y: float) -> ElementList

# Return the first element in the ElementList
ElementList.first() -> PDFElement

# Return the last element in the ElementList
ElementList.last() -> PDFElement
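
To make the proposed behaviour concrete, here is a minimal sketch of how these methods might work. It does not use the real py-pdf-parser classes: `Element` and `Elements` are hypothetical stand-ins for `PDFElement` and `ElementList`, and the `y0`/`y1` fields are assumed to be the bottom and top edges of an element in PDF coordinates (origin at the bottom-left of the page, y increasing upwards). An element counts as "contained within" the header or footer only if it lies entirely inside it.

```python
import math
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass(frozen=True)
class Element:
    """Hypothetical stand-in for PDFElement (not the library class)."""
    text: str
    font_size: float
    y0: float  # bottom edge; assumes origin at bottom-left of the page
    y1: float  # top edge


class Elements:
    """Hypothetical stand-in for ElementList."""

    def __init__(self, items: Iterable[Element]) -> None:
        self._items: List[Element] = list(items)

    def __iter__(self) -> Iterator[Element]:
        return iter(self._items)

    def __len__(self) -> int:
        return len(self._items)

    def filter(self, predicate: Callable[[Element], bool]) -> "Elements":
        # Keep only the elements for which the predicate returns True.
        return Elements(e for e in self._items if predicate(e))

    def filter_by_font_size(self, font_size: float) -> "Elements":
        # Compare sizes only, ignoring the font name; isclose avoids
        # surprises from floating-point rounding.
        return self.filter(lambda e: math.isclose(e.font_size, font_size))

    def filter_out_header(self, bottom_of_header_y: float) -> "Elements":
        # Drop elements lying entirely at or above the header's bottom edge.
        return self.filter(lambda e: e.y0 < bottom_of_header_y)

    def filter_out_footer(self, top_of_footer_y: float) -> "Elements":
        # Drop elements lying entirely at or below the footer's top edge.
        return self.filter(lambda e: e.y1 > top_of_footer_y)

    def first(self) -> Element:
        # Raises IndexError on an empty list.
        return self._items[0]

    def last(self) -> Element:
        # Raises IndexError on an empty list.
        return self._items[-1]


# Made-up sample data: one header line, two body lines, one footer line.
elements = Elements([
    Element("Page title", 10.0, 780.0, 800.0),
    Element("Intro", 12.0, 600.0, 620.0),
    Element("Details", 12.0, 400.0, 420.0),
    Element("Page 1", 8.0, 20.0, 35.0),
])

# Chain the filters to keep only the body of the page.
body = elements.filter_out_header(770.0).filter_out_footer(50.0)
```

Because every method returns a new list, the calls chain in the same way as the existing `filter_by_*` methods, and `filter` can serve as the building block for the more specific ones.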

I'd be happy to implement these myself.

aiden2480, May 01 '24 06:05