"Object reference not set to an instance of an object."

Open rklec opened this issue 1 year ago • 6 comments

Steps to reproduce (STR)

Call PdfDocument.Open(pdfBytes) with a particular PDF file. As it contains sensitive data, I unfortunately cannot attach it here, and I was unable to create a minimal example, but here are some hints:

  • it was quite a complex file, with images, a header, etc. (not much text, though)
  • it contains some comments that are shown on hover (but without the icon Acrobat usually shows)
  • it was last edited with Firefox and some notes and drawings were added
  • it was last edited at 2023-08/2023-09

It looks much like the following. I tried to reproduce the problem with this example, but the exception does not occur: (screenshot)

Thus, I am attaching only this image, because the problem is not reproducible with the PDF I created.

What happens

System.NullReferenceException
  HResult=0x80004003
  Message = Object reference not set to an instance of an object.
  Source = UglyToad.PdfPig
  StackTrace:
   at UglyToad.PdfPig.PdfExtensions.TryGet[T](DictionaryToken dictionary, NameToken name, IPdfTokenScanner tokenScanner, T& token)

Apparently, this is the line of failure: https://github.com/UglyToad/PdfPig/blob/a99c0d25bfe76e4e7a919a42c52c99022ac769d3/src/UglyToad.PdfPig/PdfExtensions.cs#L24

What should happen

At the very least, a PdfDocumentFormatException should be thrown if you consider the file invalid.

However, IMHO, the file is valid and can be opened with both Adobe Acrobat Reader and Firefox. Thus, actually parsing it would be good.

Also, when opening it with Adobe Acrobat Reader and re-saving it, it can be parsed!

System

PdfPig 0.1.8, reproducible on Windows 10

Internal reference: 2118

rklec avatar Aug 06 '24 13:08 rklec

Hi @rklec, it's going to be complicated to help you without the document...

Can you try with the latest version of PdfPig (pre-release 0.1.9, available via NuGet packages)?

BobLd avatar Aug 06 '24 13:08 BobLd

I'm running into this issue as well with the attached document: ErcotFacts.pdf. If I set SkipMissingFonts to true, the above exception gets thrown. When that option is not specified, I get the following exception instead:

   at UglyToad.PdfPig.Util.DictionaryTokenExtensions.GetNameOrDefault(DictionaryToken dictionaryToken, NameToken name)
   at UglyToad.PdfPig.PdfFonts.Parser.Handlers.Type0FontHandler.ParseDescendant(DictionaryToken dictionary)
   at UglyToad.PdfPig.PdfFonts.Parser.Handlers.Type0FontHandler.Generate(DictionaryToken dictionary)
   at UglyToad.PdfPig.PdfFonts.FontFactory.Get(DictionaryToken dictionary)
   at UglyToad.PdfPig.Content.ResourceStore.LoadFontDictionary(DictionaryToken fontDictionary)
   at UglyToad.PdfPig.Content.ResourceStore.LoadResourceDictionary(DictionaryToken resourceDictionary)
   at UglyToad.PdfPig.Content.BasePageFactory`1.Create(Int32 number, DictionaryToken dictionary, PageTreeMembers pageTreeMembers, NamedDestinations namedDestinations)
   at UglyToad.PdfPig.Content.Pages.GetPage[TPage](IPageFactory`1 pageFactory, Int32 pageNumber, NamedDestinations namedDestinations, ParsingOptions parsingOptions)
   at UglyToad.PdfPig.Content.Pages.GetPage(Int32 pageNumber, NamedDestinations namedDestinations, ParsingOptions parsingOptions)
   at UglyToad.PdfPig.PdfDocument.GetPage(Int32 pageNumber)
   at UglyToad.PdfPig.PdfDocument.<GetPages>d__34.MoveNext()
   at System.Collections.Generic.LargeArrayBuilder`1.AddRange(IEnumerable`1 items)
   at System.Collections.Generic.EnumerableHelpers.ToArray[T](IEnumerable`1 source)
   at System.Linq.SystemCore_EnumerableDebugView`1.get_Items()

Any help with a fix for this would be greatly appreciated!

jmjohnson05 avatar Aug 21 '24 19:08 jmjohnson05

Surprisingly, the linked ErcotFacts.pdf does not throw for me. (It was encoded and decoded in an email, though.)

rklec avatar Aug 22 '24 16:08 rklec

Hi @rklec, I should have clarified: the exception I'm seeing occurs when calling the GetPages() method.

For example:

using PdfDocument? document = PdfDocument.Open( stream );

if ( document is null )
{
    _logger.LogWarning( "Failed to open PDF document" );

    return result;
}

// The exception is thrown while enumerating the pages returned by GetPages().
foreach ( var pg in document.GetPages() )
{
    _logger.LogInformation( "Processing page {PageNumber}", pg.Number );
}

jmjohnson05 avatar Aug 22 '24 17:08 jmjohnson05

Thanks for sharing the document. I've created a PR that fixes the issue when SkipMissingFonts = true.

BobLd avatar Aug 22 '24 20:08 BobLd

Much appreciated, @BobLd.

jmjohnson05 avatar Aug 24 '24 22:08 jmjohnson05