James Healy comments

Results: 139 comments of James Healy

Ignore pdf footer while reading

Given the change starts in 2.1.0, I'd guess it might be a result of this commit a8ca5dc

Getting unreadable data (UTF-8 squares 50% of the time)

In my experience pdf-reader does a reasonable (but not perfect) text extraction from the majority of PDFs, but it does depend on the source files. For the 50% where it...

Getting unreadable data (UTF-8 squares 50% of the time)

Hi @scottybigo. I downloaded all three files and tested text extraction with pdf-reader like this:

```
$ ruby -Ilib bin/pdf_text ~/downloads/4500067854.pdf
$ ruby -Ilib bin/pdf_text ~/downloads/23781.pdf
$ ruby -Ilib bin/pdf_text...
```

Does pdf-reader manage tagged PDF ?

I believe pdf-reader will provide access to the tagged data, but it's pretty low level. For example, the high-ish level `Page#text` method ignores tags, but the low-level `Page#walk_contents` method should...
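
As a rough illustration of the low-level route, here is a minimal sketch of a receiver that records marked-content tags. It assumes the receiver is handed to the public `Page#walk` entry point and that the marked-content operators arrive as `begin_marked_content` and `begin_marked_content_with_pl` callbacks; both are assumptions worth checking against the installed pdf-reader version. The class name `MarkedContentReceiver` and the file name `example.pdf` are hypothetical.

```ruby
# Hypothetical receiver, not part of pdf-reader.
# Collects the tag of every marked-content section it is told about.
class MarkedContentReceiver
  attr_reader :tags

  def initialize
    @tags = []
  end

  # Assumed callback for the BMC operator (a tag with no property list).
  def begin_marked_content(tag)
    @tags << tag
  end

  # Assumed callback for the BDC operator (a tag plus a property list).
  def begin_marked_content_with_pl(tag, _properties)
    @tags << tag
  end
end

# Usage sketch (requires the pdf-reader gem; the file name is a placeholder):
#
#   require "pdf-reader"
#   reader = PDF::Reader.new("example.pdf")
#   reader.pages.each do |page|
#     receiver = MarkedContentReceiver.new
#     page.walk(receiver)
#     puts "page #{page.number}: #{receiver.tags.inspect}"
#   end
```

The receiver only lists tag names. Mapping them back to the document's structure tree would need further work that this sketch does not attempt.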

PDF titles ending w/ NULL character

> Is this a bug in ImageMagick or in pdf-reader? It seems like an encoding issue. Unfortunately it's hard to say without looking at the PDF. Are you able to...

PDF titles ending w/ NULL character

On the trailing null character: I think I'm inclined to leave it in. I can see in the PDFs that the null character is included and unlike in C there's...
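
If the library leaves the null character in place, callers who don't want it can strip it themselves. A minimal sketch, assuming the title comes back as an ordinary Ruby string; `strip_trailing_nulls` is a hypothetical helper, not something pdf-reader provides.

```ruby
# Hypothetical helper, not part of pdf-reader.
# Returns a copy of the string with any trailing NUL characters removed.
# Non-string values (e.g. nil for a missing title) are returned unchanged.
def strip_trailing_nulls(value)
  return value unless value.is_a?(String)

  result = value.dup
  result.chomp!("\0") while result.end_with?("\0")
  result
end

# Usage sketch (requires the pdf-reader gem; the file name is a placeholder):
#
#   require "pdf-reader"
#   reader = PDF::Reader.new("example.pdf")
#   title  = strip_trailing_nulls(reader.info[:Title])
```

Only trailing nulls are removed; a null in the middle of the string is left alone, since that more likely points to a genuine encoding problem.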

wrong table parsing output

It'd be interesting to know if this is still an issue in v2.9.0 - there have been a number of fixes to glyph positioning calculations in the last few versions. If...

Not detected all the embedded fonts, only sometimes

pdf-reader is capable of detecting all fonts in a PDF, but that example isn't as robust as it could be and will need expanding for most real-world systems. When you run...

page_count undefined method `[]' for nil:NilClass

Thanks for the report. To understand the cause I'd really have to see the problem PDF. Are you able to share it with me via email ([email protected]'d.au)? On 18/01/2013 6:18...

page_count undefined method `[]' for nil:NilClass

Damn you autocorrect. My address is [email protected] On 20/01/2013 12:57 AM, "Sands Fish" [email protected] wrote: > James, does your email address have a single-quote character in it? > Doesn't like...