pdf-lib

Corrupted PDF

Open emilsedgh opened this issue 2 years ago • 13 comments

Hi.

This is an amazing library. Thanks a lot @Hopding. I know you've been inactive for a while, but the quality of the code and the support you gave during your active time has been absolutely phenomenal. You don't see such fantastic support even for paid products. Good luck with whatever you're up to.

My issue is this: I have a PDF file that looks like this:

Screen Shot 2021-07-31 at 9 06 17 PM

But when I open/save it using pdf-lib, it will look like this:

Screen Shot 2021-07-31 at 9 10 36 PM

Has anyone ever had a similar experience?

emilsedgh avatar Aug 01 '21 04:08 emilsedgh

test.pdf

Here is the PDF file for reference so this could be easily reproduced.

emilsedgh avatar Aug 01 '21 04:08 emilsedgh

I have added a $500 bounty for anyone who can fix this.

Note that I will consider this fixed (for the bounty) only if it is fixed in pdf-lib itself, not by changing the PDF file (e.g. by saving or compressing it with other programs).

emilsedgh avatar Aug 01 '21 04:08 emilsedgh

@emilsedgh

image

I believe there is some non-critical error in the PDF file you provided, since I'm not able to run it through iText RUPS to investigate its structure:

com.itextpdf.kernel.PdfException: Invalid indirect reference {0}.
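The iText message suggests a reference that does not resolve to an object. A minimal sketch of how one might look for such references in an uncompressed dump of the file is given here. `findDanglingReferences` and `collectPairs` are hypothetical helpers, not part of iText or pdf-lib, and the regex scan assumes the objects are stored as plain text (it will miss anything inside object streams or compressed streams, and can be fooled by matching text inside strings).

```typescript
// Collect "objectNumber generation" pairs matched by a global regex
// with two capture groups.
function collectPairs(text: string, pattern: RegExp): Set<string> {
  const found = new Set<string>();
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    found.add(`${m[1]} ${m[2]}`);
  }
  return found;
}

// Hypothetical helper: list every "N G R" reference that has no
// matching "N G obj" definition in the given text.
function findDanglingReferences(pdfText: string): string[] {
  const defined = collectPairs(pdfText, /\b(\d+)\s+(\d+)\s+obj\b/g);
  const referenced = collectPairs(pdfText, /\b(\d+)\s+(\d+)\s+R\b/g);
  const dangling: string[] = [];
  referenced.forEach((ref) => {
    if (!defined.has(ref)) dangling.push(ref);
  });
  return dangling;
}
```

A reference reported by this scan is only a candidate: the object may still exist inside an object stream that the scan cannot see.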

I suspect that the custom font is not properly embedded.

For example, the following field uses ArialBold:

105 0 obj
<</V (��) /DA (/ArialBold 0 Tf 0 0 0.501961 rg) /DR 114 0 R /F 4 /FT /Tx /Rect [39.5289 469.115 139.84 480.419 ] /Subtype /Widget /T (Lease MLS) /TU (Lease MLS) /Type /Annot /MK 118 0 R /Ff 0 /M (D:20210728200742Z) /AP <</N 19 0 R >> >> 
endobj
107 1 obj
<</Length 0 /Subtype /Form /BBox [0 0 99.64 11.479 ] >> stream

endstream

endobj

FYI, ArialBold is not one of the standard fonts: https://pdf-lib.js.org/docs/api/enums/standardfonts

Are you in control of the generation of that particular PDF file, or do you just want to modify it?

I've repaired your PDF file and provided it in the following repo.
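As a rough illustration of the check described above, the font name can be pulled out of the field's `/DA` string and compared against the 14 standard PDF font names. `parseDefaultAppearance` is a hypothetical helper, not a pdf-lib API. Strictly speaking, the name in `/DA` is a resource key that should resolve through the `/DR` dictionary, so comparing it to standard font names is only a heuristic.

```typescript
// The 14 standard PDF font names (the same strings pdf-lib uses as
// the values of its StandardFonts enum).
const STANDARD_FONT_NAMES: ReadonlySet<string> = new Set([
  "Courier", "Courier-Bold", "Courier-Oblique", "Courier-BoldOblique",
  "Helvetica", "Helvetica-Bold", "Helvetica-Oblique", "Helvetica-BoldOblique",
  "Times-Roman", "Times-Bold", "Times-Italic", "Times-BoldItalic",
  "Symbol", "ZapfDingbats",
]);

interface DefaultAppearance {
  fontName: string; // resource name that precedes the Tf operator
  fontSize: number; // 0 conventionally means "auto-size"
}

// Hypothetical helper: extract "/Name size Tf" from a /DA string.
function parseDefaultAppearance(da: string): DefaultAppearance | null {
  const match = /\/([^\s\/]+)\s+(-?\d*\.?\d+)\s+Tf/.exec(da);
  if (match === null) return null;
  return { fontName: match[1], fontSize: Number(match[2]) };
}

function isStandardFontName(name: string): boolean {
  return STANDARD_FONT_NAMES.has(name);
}
```

For the field shown above, `parseDefaultAppearance("/ArialBold 0 Tf 0 0 0.501961 rg")` yields the name `ArialBold`, which `isStandardFontName` rejects.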

https://github.com/PhakornKiong/pdfLoadError

PhakornKiong avatar Aug 07 '21 10:08 PhakornKiong

Hi @PhakornKiong. Good job investigating. Since other PDF software is able to recover from this situation, I'd love to see a patch that makes pdf-lib recover from it too. For example, other PDF software is able to fall back to other fonts.

Unfortunately, I have a series of PDFs that have already been generated. My intention is to be able to use them with pdf-lib.
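To make the fallback idea concrete, here is a minimal sketch of how a missing font name could be mapped to one of the 14 standard fonts. This is not how pdf-lib or any other PDF software actually behaves; `pickFallbackFont` and its keyword matching are assumptions made purely for illustration.

```typescript
// Hypothetical heuristic: choose a standard font whose family and
// style roughly match the requested (missing) font name.
function pickFallbackFont(requested: string): string {
  const name = requested.toLowerCase();
  const bold = name.includes("bold");
  const italic = name.includes("italic") || name.includes("oblique");

  if (name.includes("courier") || name.includes("mono")) {
    if (bold && italic) return "Courier-BoldOblique";
    if (bold) return "Courier-Bold";
    if (italic) return "Courier-Oblique";
    return "Courier";
  }
  if (name.includes("times")) {
    if (bold && italic) return "Times-BoldItalic";
    if (bold) return "Times-Bold";
    if (italic) return "Times-Italic";
    return "Times-Roman";
  }
  // Default to the Helvetica family for everything else (e.g. Arial).
  if (bold && italic) return "Helvetica-BoldOblique";
  if (bold) return "Helvetica-Bold";
  if (italic) return "Helvetica-Oblique";
  return "Helvetica";
}
```

With this heuristic, the `ArialBold` name from the field above would map to `Helvetica-Bold`.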

Thanks.

emilsedgh avatar Aug 07 '21 17:08 emilsedgh

@emilsedgh does this happen if you save the document with pdfDoc.save({ useObjectStreams: false })?
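When comparing the two kinds of output, it can help to confirm which form a saved file actually uses. A rough sketch, assuming that files written with object streams contain a plain-text `/ObjStm` marker and files written without them do not: `containsAscii` and `looksLikeObjectStreamPdf` are hypothetical helpers, not pdf-lib APIs.

```typescript
// Hypothetical helper: true if the ASCII string `needle` occurs
// anywhere in `bytes`.
function containsAscii(bytes: Uint8Array, needle: string): boolean {
  outer: for (let i = 0; i + needle.length <= bytes.length; i++) {
    for (let j = 0; j < needle.length; j++) {
      if (bytes[i + j] !== needle.charCodeAt(j)) continue outer;
    }
    return true;
  }
  return false;
}

// Heuristic: object-stream dictionaries carry "/Type /ObjStm" in
// plain text, so the marker's presence suggests object streams.
function looksLikeObjectStreamPdf(bytes: Uint8Array): boolean {
  return containsAscii(bytes, "/ObjStm");
}
```

The bytes returned by `pdfDoc.save({ useObjectStreams: false })` can be passed straight to `looksLikeObjectStreamPdf`. The check is only a heuristic, since the same characters could in principle appear inside unrelated uncompressed data.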

Hopding avatar Sep 22 '21 22:09 Hopding

Yes. The same thing happens although the results look slightly different.

emilsedgh avatar Sep 22 '21 22:09 emilsedgh

test.pdf

Here is the PDF file for reference so this could be easily reproduced.

Is this the corrupted PDF or the original PDF?

sparticvs avatar Sep 28 '21 12:09 sparticvs

This is the original one.

emilsedgh avatar Sep 28 '21 16:09 emilsedgh

I also ran into this cryptic issue. I then tested with the pdfLoadError tool, but the individual lines it reported didn't convince me as error handling. So I split the PDF document into individual pages (https://www.ilovepdf.com/split_pdf) and, curiously, the split first page is now displayed correctly. So just by splitting, the problem is gone. I hope this helps you, @Hopding.

dcsline avatar Oct 07 '21 09:10 dcsline

@dcsline there is an easier solution to your problem that will come with the new release of pdf-lib (see PR #986). Hopefully, that means splitting your PDF will no longer be needed before working with pdf-lib 😃

mohamedsalem401 avatar Oct 07 '21 09:10 mohamedsalem401

POST /#951/:0/merg e_requests

Joram3 avatar Apr 12 '23 10:04 Joram3

Merge pull request #1000  from Hopding/POST /#951/:0/merg e_requests

Joram3 avatar Apr 15 '23 03:04 Joram3

Has this issue been solved yet?

gpugems avatar Feb 06 '24 11:02 gpugems