Peter Williams comments

Results: 13 comments by Peter Williams

Vectorized PDF text and object extraction

We have some PDF full-text search code that is unfortunately in a private repo. The idea behind it is simple. We keep track of the `textMark`s that were used to...

Validation Standards PDF/A

Hi, PaperCut customers need PDF/A compliance. Is anyone else planning to work on this? If not then I can make a proposal.

Advanced text extraction on columns, tables, equations

Ben, I will investigate this. I am working on a few versions of table extraction code that I have not submitted yet. They address most/all of the issues you...

Advanced text extraction on columns, tables, equations

Thanks. That will give me a benchmark to compare against.

[PROBLEM] ExtractText() Behaviour differences in new version( 3.0.3 to 3.14.0)

Hi @cyberlord29, it looks like my table extraction code changes caused this problem. Those changes improve extraction of many other types of tables. We can give you a better experience...

[PROBLEM] ExtractText() Behaviour differences in new version( 3.0.3 to 3.14.0)

peter.wi > @peterwilliams97 That sounds great, can you leave an email ID here so I can send it to you? [email protected]

[PROBLEM] ExtractText() Behaviour differences in new version( 3.0.3 to 3.14.0)

Hi Maneesh, sorry for the late reply. This is my email.

PDF image file processing issue: bad RST marker error

Coincidentally, I am away from PDF this month working on image processing. Which Go JPEG decoder(s) do you recommend?

Text redaction support

This is a useful feature. I hope to find time to build a prototype on top of PageText. As noted above, this will require references from the TextMarks back to...

Text redaction support

Part 2 requires adding links from the extracted text back to the content stream during text extraction. It's the same principle as the links from extracted text back to bounding...
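
To make the principle concrete, here is a minimal Go sketch of keeping links from extracted text back to both bounding boxes and content stream positions. It is an illustration only, not the library's API: the `mark` type, its `OpIndex` field, and the helpers `buildText`, `marksFor` and `find` are all hypothetical names, and the sketch assumes each run of text comes from a single content stream operator.

```go
package main

import (
	"fmt"
	"strings"
)

// mark is a hypothetical stand-in for a text mark: one run of extracted text
// plus where it came from on the page and in the content stream.
type mark struct {
	Text    string
	Offset  int        // byte offset of Text within the extracted page text
	BBox    [4]float64 // llx, lly, urx, ury in page coordinates
	OpIndex int        // assumed: index of the content stream operator that drew Text
}

// buildText concatenates the marks in order and records each mark's offset
// in the resulting string.
func buildText(marks []mark) (string, []mark) {
	var sb strings.Builder
	out := make([]mark, len(marks))
	for i, m := range marks {
		m.Offset = sb.Len()
		sb.WriteString(m.Text)
		out[i] = m
	}
	return sb.String(), out
}

// marksFor returns the marks that overlap the byte range [start, end).
func marksFor(marks []mark, start, end int) []mark {
	var hits []mark
	for _, m := range marks {
		if m.Offset < end && m.Offset+len(m.Text) > start {
			hits = append(hits, m)
		}
	}
	return hits
}

// find returns the marks covering the first occurrence of term in text,
// or nil if term is empty or absent.
func find(text string, marks []mark, term string) []mark {
	if term == "" {
		return nil
	}
	start := strings.Index(text, term)
	if start < 0 {
		return nil
	}
	return marksFor(marks, start, start+len(term))
}

func main() {
	text, marks := buildText([]mark{
		{Text: "Account ", BBox: [4]float64{72, 700, 120, 712}, OpIndex: 3},
		{Text: "1234", BBox: [4]float64{120, 700, 146, 712}, OpIndex: 5},
		{Text: "-5678", BBox: [4]float64{146, 700, 178, 712}, OpIndex: 6},
		{Text: " closed", BBox: [4]float64{178, 700, 220, 712}, OpIndex: 9},
	})
	// Each hit carries the box to paint over and the operator to rewrite.
	for _, m := range find(text, marks, "1234-5678") {
		fmt.Printf("op %d bbox %v\n", m.OpIndex, m.BBox)
	}
}
```

The same offset lookup serves both uses: the bounding boxes say where to draw, and the operator indexes say which parts of the content stream would need editing so the redacted text is actually removed rather than just covered.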