PdfPig
"Object reference not set to an instance of an object."
STR
Call PdfDocument.Open(pdfBytes) with a particular PDF file. As it contains sensitive data, I unfortunately cannot attach it here, and I was unable to create a minimal example, but here are some hints:
- it is a fairly complex document with images, a header, etc. (not much text, though)
- it contains some comments that are shown on hover (but without the icon Acrobat usually shows)
- it was last edited with Firefox, and some notes and drawings were added
- it was last edited in 2023-08/2023-09
It looks much like this. I tried to reproduce the problem with this example, but the error does not occur:
Thus, I can only attach this image, because the problem is not reproducible with the PDF I created.
What happens
System.NullReferenceException
HResult=0x80004003
Message = Object reference not set to an instance of an object.
Source = UglyToad.PdfPig
Stack trace:
at UglyToad.PdfPig.PdfExtensions.TryGet[T](DictionaryToken dictionary, NameToken name, IPdfTokenScanner tokenScanner, T& token)
Apparently, this is the line of failure: https://github.com/UglyToad/PdfPig/blob/a99c0d25bfe76e4e7a919a42c52c99022ac769d3/src/UglyToad.PdfPig/PdfExtensions.cs#L24
What should happen
At the very least, a PdfDocumentFormatException should be thrown if you consider the file invalid.
However, IMHO, the file is valid and can be opened with both Adobe Acrobat Reader and Firefox. Thus, actually parsing it would be good.
Also, when opening it with Adobe Acrobat Reader and re-saving it, it can be parsed!
System
PdfPig 0.1.8, reproducible on Windows 10
Internal reference: 2118
Hi @rklec it's going to be complicated to help you without the document...
Can you try with the latest version of PdfPig (pre-release 0.1.9, available via NuGet packages)?
I'm running into this issue as well with the attached document (ErcotFacts.pdf). If I set SkipMissingFonts to true, the above exception is thrown. When that option is not specified, I get the following exception instead:
at UglyToad.PdfPig.Util.DictionaryTokenExtensions.GetNameOrDefault(DictionaryToken dictionaryToken, NameToken name)
at UglyToad.PdfPig.PdfFonts.Parser.Handlers.Type0FontHandler.ParseDescendant(DictionaryToken dictionary)
at UglyToad.PdfPig.PdfFonts.Parser.Handlers.Type0FontHandler.Generate(DictionaryToken dictionary)
at UglyToad.PdfPig.PdfFonts.FontFactory.Get(DictionaryToken dictionary)
at UglyToad.PdfPig.Content.ResourceStore.LoadFontDictionary(DictionaryToken fontDictionary)
at UglyToad.PdfPig.Content.ResourceStore.LoadResourceDictionary(DictionaryToken resourceDictionary)
at UglyToad.PdfPig.Content.BasePageFactory`1.Create(Int32 number, DictionaryToken dictionary, PageTreeMembers pageTreeMembers, NamedDestinations namedDestinations)
at UglyToad.PdfPig.Content.Pages.GetPage[TPage](IPageFactory`1 pageFactory, Int32 pageNumber, NamedDestinations namedDestinations, ParsingOptions parsingOptions)
at UglyToad.PdfPig.Content.Pages.GetPage(Int32 pageNumber, NamedDestinations namedDestinations, ParsingOptions parsingOptions)
at UglyToad.PdfPig.PdfDocument.GetPage(Int32 pageNumber)
at UglyToad.PdfPig.PdfDocument.<GetPages>d__34.MoveNext()
at System.Collections.Generic.LargeArrayBuilder`1.AddRange(IEnumerable`1 items)
at System.Collections.Generic.EnumerableHelpers.ToArray[T](IEnumerable`1 source)
at System.Linq.SystemCore_EnumerableDebugView`1.get_Items()
Any help with a fix for this would be greatly appreciated!
Surprisingly, the linked ErcotFacts.pdf does not throw for me. (It was encoded and decoded in a mail, though.)
Hi @rklec, I should have clarified: the exception I'm seeing occurs when calling the GetPages() method.
For example:
using PdfDocument? document = PdfDocument.Open( stream );
if ( document is null )
{
    _logger.LogWarning( "Failed to open PDF document" );
    return result;
}
foreach ( var pg in document.GetPages() )
{
    _logger.LogInformation( "Processing page {PageNumber}", pg.Number );
}
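For reference, here is a minimal sketch of how the SkipMissingFonts option mentioned earlier in the thread can be passed when opening a document, followed by the same GetPages() loop. The file path is a hypothetical placeholder, and the sketch only shows where the option is set; it is not a fix for the exception.

```csharp
using System;
using UglyToad.PdfPig;

// Hypothetical path: substitute the PDF that triggers the exception.
const string path = "ErcotFacts.pdf";

// SkipMissingFonts is the parsing option discussed in this thread.
var options = new ParsingOptions { SkipMissingFonts = true };

using PdfDocument document = PdfDocument.Open(path, options);

// GetPages() is lazy, so a font-related failure would surface
// while iterating here rather than in PdfDocument.Open.
foreach (var page in document.GetPages())
{
    Console.WriteLine($"Processing page {page.Number}");
}
```

Running this once with SkipMissingFonts set and once without should make it easy to compare the two stack traces reported above.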
Thanks for sharing the document. I've created a PR that fixes the issue when SkipMissingFonts = true.
Much appreciated @BobLd