'w' doesn't make sense

Open SPlatten opened this issue 8 years ago • 5 comments
Looking at a specific text object, the width makes no sense at all. The width returned for the text:

LINE%20START

is 55.044. Why? The font size is 10pt and the page width is only 37.188, so how is 'w' calculated, and in what units?

SPlatten, Jul 09 '17 12:07

I also would like to understand the relationship between x and y position values and the width units for text. This would be particularly useful in determining when to merge text. For example, x + w should give me the x for the adjacent text.

aeyrium, Nov 27 '17 17:11

I ended up using text.w/2 instead of just text.w to get a text width that is consistent with the coordinates of fills and lines. So far it's working.

RomainHautefeuille, Aug 09 '18 15:08

Has anyone worked this out yet? I too would like the 'width' of the text component in a sane format so I can build a bounding box around the text object.

mattsoftware, Jul 24 '20 05:07

I managed to figure this out using some PDFs I created with characters printed at random positions on the page.

'w' appears to be in points. x, y, page width and page height are all in Page Units, but w, for some reason, was built up in points.

I also stumbled on the fact that converting Page Units to points is simply a matter of multiplying the Page Units by 16.

I tested on a few random PDFs, and it ended up being accurate.
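
To make the arithmetic concrete, here is a minimal sketch that applies this factor to the numbers from the original report. The factor of 16 is only the empirical finding described above, not something confirmed by the pdf2json documentation, and the function names are made up for illustration. Python is used just because the values are plain JSON numbers.

```python
# Empirical factor from this thread (an assumption, not a documented constant).
POINTS_PER_PAGE_UNIT = 16.0


def page_units_to_points(value: float) -> float:
    """Convert a value in Page Units (x, y, page width/height) to points."""
    return value * POINTS_PER_PAGE_UNIT


def points_to_page_units(value: float) -> float:
    """Convert a value in points (assumed unit of 'w') to Page Units."""
    return value / POINTS_PER_PAGE_UNIT


# Values quoted in the original report.
page_width_units = 37.188
text_w = 55.044

# About 595 pt, which matches the width of an A4 page.
page_width_points = page_units_to_points(page_width_units)

# About 3.44 Page Units, which fits comfortably inside a 37.188-unit page.
text_w_units = points_to_page_units(text_w)

print(page_width_points, text_w_units)
```

Under that assumption the reported width no longer exceeds the page width: the text is roughly 3.44 Page Units wide on a page that is 37.188 Page Units wide.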

wvanrensburg-zywave, May 17 '22 15:05

This appears to be correct. It worked for me too.
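
For the text-merging use case raised earlier in the thread, a rough sketch might look like the following. It assumes 'w' is in points and that 16 points equal one Page Unit, as reported above. The flattened item shape (`x`, `w`, `text`), the `max_gap` threshold and the sample values are all hypothetical. As far as I understand the pdf2json output, the real string sits URI-encoded inside each text item, which is why `unquote` is used here.

```python
from urllib.parse import unquote

# Empirical factor from this thread (an assumption, not a documented constant).
POINTS_PER_PAGE_UNIT = 16.0


def right_edge(item):
    """Right edge of a text item in Page Units: x plus w converted from points."""
    return item["x"] + item["w"] / POINTS_PER_PAGE_UNIT


def merge_adjacent(items, max_gap=0.25):
    """Merge items on one line whose horizontal gap is at most max_gap Page Units."""
    merged = []
    for item in sorted(items, key=lambda t: t["x"]):
        if merged and item["x"] - right_edge(merged[-1]) <= max_gap:
            prev = merged[-1]
            prev["text"] += item["text"]
            # Stretch the merged width (still in points) to the new right edge.
            prev["w"] = (right_edge(item) - prev["x"]) * POINTS_PER_PAGE_UNIT
        else:
            merged.append(dict(item))  # copy so the input is left untouched
    return merged


# Hypothetical items that all sit on the same line.
line = [
    {"x": 2.0, "w": 32.0, "text": unquote("LINE%20")},
    {"x": 4.0, "w": 40.0, "text": "START"},
    {"x": 20.0, "w": 16.0, "text": "far"},
]

result = merge_adjacent(line)
for item in result:
    print(item)
```

The first two items touch (2.0 + 32/16 = 4.0), so they are merged into one item, while the third is too far away and stays separate. Grouping items into lines by y is left out on purpose to keep the sketch short.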

austenstrine, Jan 01 '24 19:01