Invalid PDF data: missing %PDF header

Open LiThaM opened this issue 3 years ago • 10 comments

Hi, I converted a Word document to PDF and I receive this error:

"Invalid PDF data: missing %PDF header."

The PDF opens normally in a viewer without any problem. I have attached the PDF document:

2-Prueba_Word.pdf

LiThaM avatar Dec 14 '21 22:12 LiThaM

Hi @LiThaM, just to make sure I understand: you generated a PDF using MS Word, and when opening it with PDFParser you receive the error "Invalid PDF data: missing %PDF header."?

k00ni avatar Dec 15 '21 08:12 k00ni

Yes, that's correct. I also think there must be a bug in PHPWord, because the styling is lost completely.

LiThaM avatar Dec 15 '21 10:12 LiThaM

May we add the PDF you attached to our test environment? Is it free of charge and without any obligations?

k00ni avatar Dec 16 '21 08:12 k00ni

Yes, that is correct. The PDF is a test file I generated to find out why this error occurs.

LiThaM avatar Dec 16 '21 09:12 LiThaM

I can't replicate this issue with the pdf you provided. Could you share the code you are using?

rubenvanerk avatar Jan 19 '22 07:01 rubenvanerk

Thanks for everything, friends. The error was in the PDF converter. After fixing that, the error never showed up again.

LiThaM avatar Jan 19 '22 09:01 LiThaM

Thank you for your feedback. All the best.

k00ni avatar Jan 19 '22 11:01 k00ni

Hi everyone. I'm getting the exception below:

**Exception - Invalid PDF data: missing %PDF header.**

Debug info: Error code: generalexceptionmessage

Stack trace:
line 887 of \lib\Pdfparser\vendor\smalot\pdfparser\src\Smalot\PdfParser\RawData\RawDataParser.php: Exception thrown
line 102 of \lib\Pdfparser\vendor\smalot\pdfparser\src\Smalot\PdfParser\Parser.php: call to Smalot\PdfParser\RawData\RawDataParser->parseData()
line 90 of \lib\Pdfparser\vendor\smalot\pdfparser\src\Smalot\PdfParser\Parser.php: call to Smalot\PdfParser\Parser->parseContent()
line 330 of \user\profile\field\file\field.class.php: call to Smalot\PdfParser\Parser->parseFile()
line 714 of \user\profile\lib.php: call to profile_field_file->edit_save_data()
line 270 of \user\editadvanced.php: call to profile_save_data()

Can anyone help me fix this? I've added the PdfParser library to Moodle, and that is where I get the error above. Outside Moodle, I can parse the same PDF without any error.

Rajashekhar-Kategoud avatar Mar 20 '23 09:03 Rajashekhar-Kategoud
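When the same file parses in one environment but not in another, one parser-independent way to narrow things down is to look at the bytes that actually reach the parser. The following is only a rough sketch, written in Python to stay self-contained; `describe_leading_bytes` and its labels are hypothetical names and are not part of PDFParser or Moodle.

```python
def describe_leading_bytes(data: bytes) -> str:
    """Return a rough guess at what a byte string contains.

    Hypothetical helper for debugging; the categories are illustrative only.
    """
    if not data:
        return "empty"
    head = data[:1024]
    if head.startswith(b"%PDF-"):
        return "pdf"
    if b"%PDF-" in head:
        # The marker exists but something precedes it (BOM, whitespace, ...).
        return "pdf with leading junk"
    if head.startswith(b"PK\x03\x04"):
        # ZIP local file header, used by formats such as .docx and .odt.
        return "zip container"
    if head.lstrip()[:1] == b"<":
        # Often an HTML or XML error page saved under a .pdf name.
        return "markup"
    return "unknown"


if __name__ == "__main__":
    import sys

    # Usage: python check.py path/to/file.pdf
    with open(sys.argv[1], "rb") as handle:
        print(describe_leading_bytes(handle.read(1024)))
```

If the result is anything other than "pdf", the file handed to the parser is probably not what the viewer opened, which would point at the upload or conversion step rather than at the parser.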

I had this issue because I was using an OpenOffice library to generate the PDF. I changed my code and no longer get the error.

LiThaM avatar Mar 20 '23 10:03 LiThaM

I also have this issue with the PDFs I received from the https://ocr.space/ API. I have attached a demo PDF: demo.pdf

tdwesten avatar Sep 21 '24 08:09 tdwesten

This is an issue with the format of the PDF file, i.e. some of the expected metadata is missing from the PDFs. If your PDF file is corrupted or incorrectly formatted, it can sometimes be repaired. In my case, the files I was using were corrupted.

I'm not familiar with this code base. Would it make sense to change it so that slightly incorrect formatting is not a fatal error?

svenfulen avatar Oct 19 '24 23:10 svenfulen
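As an illustration of what tolerating slightly incorrect formatting could mean for the header specifically, here is a minimal sketch that searches for the `%PDF-` marker near the start of the data and discards anything before it. It is written in Python and is not PDFParser code; whether PDFParser already does something similar is not established in this thread. The names `find_pdf_header` and `strip_leading_junk` are hypothetical, and the 1024-byte window is an assumption modeled on the leniency some PDF viewers are reported to apply.

```python
from typing import Optional

PDF_MARKER = b"%PDF-"


def find_pdf_header(data: bytes, search_window: int = 1024) -> Optional[int]:
    """Return the offset of the first %PDF- marker inside the window, or None."""
    offset = data[:search_window].find(PDF_MARKER)
    return offset if offset != -1 else None


def strip_leading_junk(data: bytes, search_window: int = 1024) -> bytes:
    """Drop any bytes before the %PDF- marker; fail if no marker is found."""
    offset = find_pdf_header(data, search_window)
    if offset is None:
        raise ValueError("no %PDF- marker in the first %d bytes" % search_window)
    return data[offset:]
```

A check like this only helps when the header is present but preceded by stray bytes. If the marker is missing entirely, as with a file that is not a PDF at all, failing with an error remains the reasonable outcome.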