PdfSharpCore.Pdf.IO.PdfReaderException: Unexpected character '0x0069' in PDF stream.

Open fionik opened this issue 5 years ago • 5 comments

I am attempting to open a PDF file using PdfSharpCore and am getting the following exception:

PdfSharpCore.Pdf.IO.PdfReaderException: 'Unexpected character '0x0069' in PDF stream. The file may be corrupted. If you think this is a bug in PDFsharp, please send us your PDF file.'

JLLAP_14022018101618_EMAIL_14022018101618_99_00005_001.pdf

Exception stack:

Unhandled exception. PdfSharpCore.Pdf.IO.PdfReaderException: Unexpected character '0x0069' in PDF stream. The file may be corrupted. If you think this is a bug in PDFsharp, please send us your PDF file.
   at PdfSharpCore.Internal.ParserDiagnostics.ThrowParserException(String message)
   at PdfSharpCore.Internal.ParserDiagnostics.HandleUnexpectedCharacter(Char ch)
   at PdfSharpCore.Pdf.IO.Lexer.ScanLiteralString()
   at PdfSharpCore.Pdf.IO.Lexer.ScanNextToken()
   at PdfSharpCore.Pdf.IO.Parser.ParseObject(Symbol stop)
   at PdfSharpCore.Pdf.IO.Parser.ReadDictionary(PdfDictionary dict, Boolean includeReferences)
   at PdfSharpCore.Pdf.IO.Parser.ReadObject(PdfObject pdfObject, PdfObjectID objectID, Boolean includeReferences, Boolean fromObjecStream)
   at PdfSharpCore.Pdf.IO.PdfReader.Open(Stream stream, String password, PdfDocumentOpenMode openmode, PdfPasswordProvider passwordProvider)
   at PdfSharpCore.Pdf.IO.PdfReader.Open(String path, String password, PdfDocumentOpenMode openmode, PdfPasswordProvider provider)
   at PdfSharpCore.Pdf.IO.PdfReader.Open(String path)
   at PDFSharpTest.Program.Main(String[] args) in C:\Users\Andrew\Documents\Visual Studio 2019\Projects\PDFSharpTest\PDFSharpTest\Program.cs:line 11

This can be reproduced with a trivial program:

using System;
using PdfSharpCore.Pdf;
using PdfSharpCore.Pdf.IO;

namespace PDFSharpTest
{
    class Program
    {
        static void Main(string[] args)
        {
            PdfDocument document = PdfReader.Open(args[0]);
            Console.WriteLine($"Page count: {document.PageCount}.");
        }
    }
}

The library I am using is PdfSharpCore 1.1.26 downloaded from NuGet.

The file appears to be okay, as I am able to open it in Acrobat Reader without any issues.

fionik avatar May 27 '20 02:05 fionik

It appears that the source of the problem is this literal string:

\\uslil620.am.jllnet.com\invoices\2018\AUD\201801\A2210_AU003-0105254__5106022609_AU003.pdf

I understand that all these backslashes should have been escaped, and leaving them unescaped is technically a violation of the standard. On the other hand, this document says:

If the character following the backslash is not one of those shown in the table, the backslash is ignored.

So to me it looks like the library should have ignored the illegal backslash rather than throwing an exception.
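For illustration, the lenient handling described by that rule could be sketched as below. This is a minimal sketch, not PdfSharpCore's actual lexer: `LenientLiteralString` and `Unescape` are hypothetical names, and the method only decodes escape sequences in the body of a literal string (it does not track nested parentheses or normalize raw line endings). Note that 0x0069 is the character 'i', which would be consistent with the lexer stopping at the `\i` in `\invoices`.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of PdfSharpCore: decodes escape sequences in
// the body of a PDF literal string (the text between the outer parentheses).
public static class LenientLiteralString
{
    public static string Unescape(string body)
    {
        var result = new StringBuilder(body.Length);
        int i = 0;
        while (i < body.Length)
        {
            char ch = body[i++];
            if (ch != '\\')
            {
                result.Append(ch);
                continue;
            }
            if (i >= body.Length)
                break; // A trailing lone backslash is dropped.

            char next = body[i++];
            switch (next)
            {
                case 'n': result.Append('\n'); break;
                case 'r': result.Append('\r'); break;
                case 't': result.Append('\t'); break;
                case 'b': result.Append('\b'); break;
                case 'f': result.Append('\f'); break;
                case '(':
                case ')':
                case '\\':
                    result.Append(next);
                    break;
                case '\r':
                    // Backslash followed by an end-of-line is a line continuation.
                    if (i < body.Length && body[i] == '\n') i++;
                    break;
                case '\n':
                    break;
                default:
                    if (next >= '0' && next <= '7')
                    {
                        // Octal escape: one to three octal digits.
                        int value = next - '0';
                        int digits = 1;
                        while (digits < 3 && i < body.Length
                               && body[i] >= '0' && body[i] <= '7')
                        {
                            value = value * 8 + (body[i++] - '0');
                            digits++;
                        }
                        result.Append((char)(value & 0xFF)); // Keep the low byte only.
                    }
                    else
                    {
                        // Unknown escape: ignore the backslash, keep the character,
                        // instead of throwing.
                        result.Append(next);
                    }
                    break;
            }
        }
        return result.ToString();
    }
}
```

With this sketch, `LenientLiteralString.Unescape(@"com\invoices")` returns `cominvoices`. Lenient parsing would let the file open, but it would not recover the intended path: a sequence such as `\201` in the literal above would still be read as an octal escape, so the decoded string would differ from what the producer meant to write.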

fionik avatar May 27 '20 12:05 fionik

I'm sure you realise that the document above contains sensitive information that you don't want disclosed on the internet?

chrisnurse avatar Jan 25 '21 06:01 chrisnurse

> I'm sure you realise that the document above contains sensitive information that you don't want disclosed on the internet?

I would be happy if there was a secure way to communicate such documents to the developers without disclosing them on the Internet.

fionik avatar Jan 25 '21 06:01 fionik

Guys, is there any patch or fix for this issue? I am currently facing exactly the same problem. It would be a great help if someone could share any information.

ranausman008 avatar Jul 23 '22 00:07 ranausman008

Save the pdf file with a different name and merge it.

jungwonbae avatar Feb 23 '24 05:02 jungwonbae