hyperlinks have negative height
Describe the bug
The height property of a hyperlink has a negative value.
Code to reproduce the problem
- Open the PDF with pdfplumber
- Inspect pdf_file.pages[61].hyperlinks
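The steps above could be scripted roughly as follows. This is only a sketch: the helper names `load_hyperlinks` and `negative_height` are made up for illustration, and the local file path is whatever the linked PDF was saved as.

```python
def load_hyperlinks(pdf_path, page_index):
    """Return the hyperlink dicts pdfplumber reports for one page."""
    # Imported lazily so the filter below can be used without pdfplumber installed.
    import pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        return list(pdf.pages[page_index].hyperlinks)


def negative_height(objects):
    """Keep only the objects whose reported height is below zero."""
    return [obj for obj in objects if obj["height"] < 0]
```

For example, `negative_height(load_hyperlinks("report.pdf", 61))` (with `report.pdf` standing in for the downloaded file) should list the affected hyperlinks.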
PDF file
https://www.singtel.com/content/dam/singtel/about-us/sustainability/reports/Singtel-Group-Sustainability-Report-2022.pdf
Expected behavior
height should be a positive number
Actual behavior
height has a negative value
Screenshots

Environment
- pdfplumber version: 0.7.5
- Python version: 3.10.5
- OS: Ubuntu 20.04.5 LTS (Focal Fossa)
Additional context
In addition, the "top" and "bottom" attributes are swapped, which does not comply with pdfplumber's bounding box definitions as discussed in https://github.com/jsvine/pdfplumber/issues/198
Hi @bentsi, thanks for sharing this example. The height, top, and bottom attributes are all calculated from the raw annotation's Rect (bounding box), specified by the PDF in a direct command.
In this particular PDF (as observed by opening it in a text editor), that Rect command is Rect[428.053 634.536 453.041 626.144], which corresponds to exactly what you see for x0, y0, x1, y1 in your screenshot above, suggesting that pdfplumber is collecting the correct information.
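To make the arithmetic concrete, here is a small sketch of how those derived attributes follow from the Rect. It assumes the usual flip from PDF's bottom-left origin to a top-left origin and is not a copy of pdfplumber's source. `annot_box` is a hypothetical helper, and the page height is a placeholder (A4), not a value read from this PDF.

```python
def annot_box(rect, page_height):
    """Derive top/bottom/width/height from a raw Rect, taking the values as coded."""
    x0, y0, x1, y1 = rect
    return {
        "x0": x0,
        "y0": y0,
        "x1": x1,
        "y1": y1,
        # PDF's y axis points up; top/bottom are measured down from the page top.
        "top": page_height - y1,
        "bottom": page_height - y0,
        "width": x1 - x0,
        "height": y1 - y0,
    }


RECT = (428.053, 634.536, 453.041, 626.144)  # the Rect from this PDF
PAGE_HEIGHT = 841.89  # assumption: placeholder page height, not taken from the file

box = annot_box(RECT, PAGE_HEIGHT)
```

Because y1 is smaller than y0 in this Rect, the height comes out negative (about -8.392) and top ends up larger than bottom, which matches what was reported.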
Given that, there would seem to be two main options:
- Do nothing, on the principle that pdfplumber should focus on PDF objects' actual (i.e., as coded) attributes, rather than what we think the author intended.
- When pdfplumber sees an annotation that uses a bounding box that suggests a negative height, "fix" the bounding box (probably by flipping the vertical coordinates) so that it has a positive height.
My inclination is toward the first option, because trying to fix PDF creators' mistakes seems like opening a can of worms. But I'm open to suggestions otherwise.
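For reference, the second option might look something like the sketch below, whether done inside pdfplumber or by a user on their own side. `normalize_rect` is a hypothetical name, not an existing pdfplumber function. It also sorts the horizontal coordinates, which goes slightly beyond the vertical flip described above.

```python
def normalize_rect(rect):
    """Reorder a Rect so that x0 <= x1 and y0 <= y1 (positive width and height)."""
    x0, y0, x1, y1 = rect
    return (min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))


# The Rect from this PDF, with its vertical coordinates in descending order.
fixed = normalize_rect((428.053, 634.536, 453.041, 626.144))
```

A Rect that is already in order passes through unchanged, so applying this to every annotation would only affect the unusual ones.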