pdfrw

How do I load an existing PDF and replace strings in the document without changing the original layout?

Open 2622594863 opened this issue 1 year ago • 2 comments

How do I load an existing PDF and replace strings in the document without changing the original layout?

2622594863 avatar Jun 13 '24 10:06 2622594863

You may be looking for something like https://github.com/JoshData/pdf-redactor.

hoehermann avatar Jun 29 '24 02:06 hoehermann

Preserving layout, a.k.a. reflow, is a non-trivial operation. Take a look at pdfrwx, whose classes provide full support for stream decompression (all PDF stream filters are supported) and for parsing, which transforms a PDF stream into an abstract syntax tree (AST; see examples). After that, you can focus on what exactly you want to do with the text by working directly with the AST. Once the AST has been edited, you can re-encode it as a stream and save the file.

sl2c avatar Jul 20 '24 13:07 sl2c
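
For readers who want to see the decompress, edit, re-encode cycle described above in its simplest form, here is a minimal sketch using only the Python standard library. It is not pdfrw's or pdfrwx's API: the function name `replace_in_content_stream` and the regular expression are made up for illustration. It assumes the content stream is Flate-compressed, that the text is shown with a literal string and the `Tj` operator, and that the string contains no escaped or nested parentheses. It does not handle `TJ` arrays, hex strings, or custom font encodings, and it does not reposition anything, so a replacement with a different width will leave a gap or overlap the text that follows.

```python
import re
import zlib

# Matches a literal string operand followed by the Tj (show text) operator.
# Assumption: the string has no escaped or nested parentheses.
TJ_PATTERN = re.compile(rb"\(([^()\\]*)\)\s*Tj")


def replace_in_content_stream(compressed: bytes, old: bytes, new: bytes) -> bytes:
    """Decompress a Flate-encoded content stream, replace `old` with `new`
    inside Tj string operands only, and compress the result again.

    `new` must not contain parentheses or backslashes, since this sketch
    does no escaping.
    """
    raw = zlib.decompress(compressed)

    def swap(match):
        # Edit only the string operand; operators around it stay untouched.
        text = match.group(1).replace(old, new)
        return b"(" + text + b") Tj"

    edited = TJ_PATTERN.sub(swap, raw)
    return zlib.compress(edited)


if __name__ == "__main__":
    stream = zlib.compress(b"BT /F1 12 Tf 72 720 Td (Hello World) Tj ET")
    result = replace_in_content_stream(stream, b"World", b"There")
    print(zlib.decompress(result))
```

A regular expression only works for the simplest streams. Real documents mix operators, kerning arrays, and encoded glyph codes, which is why working on a parsed representation of the stream, as suggested above, is the more robust route.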