PdfPig
Invalid ColorSpace token encountered in page resource dictionary: <ColorSpace, /DeviceRGB>
Hi, I'm using the latest version of PdfPig on NuGet (0.1.10).
When reading data from a PDF (which I cannot attach because it contains sensitive data), I get this exception from both the GetPages() and GetPage(Int32 pageNumber) methods:
UglyToad.PdfPig.Core.PdfDocumentFormatException
  HResult=0x80131500
  Message=Invalid ColorSpace token encountered in page resource dictionary: <ColorSpace, /DeviceRGB>.
  Source=UglyToad.PdfPig
  StackTrace:
   at UglyToad.PdfPig.Content.ResourceStore.LoadResourceDictionary(DictionaryToken resourceDictionary)
   at UglyToad.PdfPig.Content.BasePageFactory`1.Create(Int32 number, DictionaryToken dictionary, PageTreeMembers pageTreeMembers, NamedDestinations namedDestinations)
   at UglyToad.PdfPig.Content.Pages.GetPage[TPage](IPageFactory`1 pageFactory, Int32 pageNumber, NamedDestinations namedDestinations, ParsingOptions parsingOptions)
   at UglyToad.PdfPig.PdfDocument.GetPage(Int32 pageNumber)
@tronci can you try with the latest prerelease version of PdfPig and confirm you still have the problem?
If so, can you add your PDF to the tests locally, debug it, and find out which token type your color space token is? The exception comes from here: https://github.com/UglyToad/PdfPig/blob/89abf6de546c674e6751c2480fc93fd29f5cd13f/src/UglyToad.PdfPig/Content/ResourceStore.cs#L124
@BobLd I still have the problem with the latest git version.
The token type is {<ColorSpace, /DeviceRGB>}
The exception is thrown while processing this line: https://github.com/UglyToad/PdfPig/blob/89abf6de546c674e6751c2480fc93fd29f5cd13f/src/UglyToad.PdfPig/Content/ResourceStore.cs#L106
@tronci can you add the following if case in the ResourceStore class and confirm you are able to parse the document? Also available in https://github.com/BobLd/PdfPig/blob/6d0df7f4daae4107422000faa1ec5880cae57f03/src/UglyToad.PdfPig/Content/ResourceStore.cs#L122C1-L134C25
I'm expecting your token to be a DictionaryToken.
else if (parsingOptions.UseLenientParsing &&
         DirectObjectFinder.TryGet(nameColorSpacePair.Value, scanner, out DictionaryToken? dict) &&
         dict.TryGet(NameToken.ColorSpace, scanner, out DictionaryToken? csDict))
{
    // See issue #1061
    foreach (var nameCsPair in csDict.Data)
    {
        if (DirectObjectFinder.TryGet(nameCsPair.Value, scanner, out NameToken? csName))
        {
            namedColorSpaces[NameToken.Create(nameCsPair.Key)] = new ResourceColorSpace(csName);
        }
    }
}
Hi, I still cannot parse the document. The conditions evaluate as follows:
parsingOptions.UseLenientParsing is False
DirectObjectFinder.TryGet(nameColorSpacePair.Value, scanner, out DictionaryToken? dict) is True
dict.TryGet(NameToken.ColorSpace, scanner, out DictionaryToken? csDict) is False
dict.TryGet returns False because {/DeviceRGB} in line 25 of PdfExtensions.cs is not an IndirectReferenceToken.
@tronci can you try with the following change, same if case (also available here https://github.com/BobLd/PdfPig/blob/0d1009906691224d93c5b6887d44e86100112458/src/UglyToad.PdfPig/Content/ResourceStore.cs#L122C1-L128C22):
else if (parsingOptions.UseLenientParsing &&
         DirectObjectFinder.TryGet(nameColorSpacePair.Value, scanner, out DictionaryToken? dict) &&
         dict.TryGet(NameToken.ColorSpace, scanner, out NameToken? csName))
{
    // See issue #1061
    namedColorSpaces[name] = new ResourceColorSpace(csName);
}
As a side note, when opening your document, please use UseLenientParsing = true:
using (var document = PdfDocument.Open(path, new ParsingOptions() { UseLenientParsing = true }))
{
    [...]
}
Now, with UseLenientParsing on and the new if case, it works like a charm ^_^
The fix is now merged and will be available in the latest pre-release version by tomorrow.