PdfPig
IndexOutOfRangeException in ByteEncodingCMapTable.CharacterCodeToGlyphIndex Method
Issue Description:
There is an issue in the CharacterCodeToGlyphIndex method within the ByteEncodingCMapTable class, which implements the ICMapSubTable interface in the namespace UglyToad.PdfPig.Fonts.TrueType.Tables.CMapSubTables.
The method throws a System.IndexOutOfRangeException when it accesses an index outside the bounds of the glyphMapping array. Below is the implementation of the method:
public int CharacterCodeToGlyphIndex(int characterCode)
{
    if (characterCode < 0 || characterCode >= GlyphMappingLength)
    {
        return 0;
    }
    return glyphMapping[characterCode];
}
Exception Details:
System.IndexOutOfRangeException: Index was outside the bounds of the array.
at UglyToad.PdfPig.Fonts.TrueType.Tables.CMapSubTables.ByteEncodingCMapTable.CharacterCodeToGlyphIndex(Int32 characterCode) in D:\..........UglyToad.PdfPig.Fonts\TrueType\Tables\CMapSubTables\ByteEncodingCMapTable.cs:line 57
Exception thrown: 'System.IndexOutOfRangeException' in UglyToad.PdfPig.Fonts.dll
Investigation Insights:
Upon examining the values, it appears that the exception is caused by a discrepancy between GlyphMappingLength and the actual length of the glyphMapping array:
GlyphMappingLength = 256
glyphMapping = {byte[252]}
Attached Document:
I have also attached a document, example.pdf, which might help in reproducing and investigating the issue further.
Steps to Reproduce:
- Use the CharacterCodeToGlyphIndex method with an input characterCode greater than or equal to the length of the glyphMapping array but less than GlyphMappingLength.
- The method throws System.IndexOutOfRangeException.
Sample Code to Reproduce the Issue:
using var document = PdfDocument.Open(path);
var text = new StringBuilder();
foreach (var page in document.GetPages())
{
    text.AppendLine(string.Join(" ", page.GetWords()));
}
return text.ToString();
Proposed Solution:
Ensure that both the GlyphMappingLength and the glyphMapping array are synchronized in terms of their lengths to avoid out-of-bounds access, or update the method to validate against the actual length of glyphMapping rather than GlyphMappingLength.
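To illustrate the difference between the two bounds checks, here is a minimal standalone sketch. It is not the PdfPig source: the class ByteEncodingSketch and both method names are hypothetical, and it assumes GlyphMappingLength is a fixed 256 (the value observed in the debugger) while the array passed in may be shorter.

```csharp
using System;

// Hypothetical stand-in for ByteEncodingCMapTable; not the PdfPig source.
public sealed class ByteEncodingSketch
{
    // Assumption: the nominal table length is a fixed 256, as seen in the debugger.
    private const int GlyphMappingLength = 256;

    private readonly byte[] glyphMapping;

    public ByteEncodingSketch(byte[] glyphMapping)
    {
        this.glyphMapping = glyphMapping ?? throw new ArgumentNullException(nameof(glyphMapping));
    }

    // Mirrors the reported check: trusts the nominal length,
    // so a shorter array can still be indexed out of bounds.
    public int LookupWithNominalLength(int characterCode)
    {
        if (characterCode < 0 || characterCode >= GlyphMappingLength)
        {
            return 0;
        }

        return glyphMapping[characterCode];
    }

    // Alternative check: validates against the array that is actually indexed.
    public int LookupWithActualLength(int characterCode)
    {
        if (characterCode < 0 || characterCode >= glyphMapping.Length)
        {
            return 0;
        }

        return glyphMapping[characterCode];
    }
}
```

With an instance built from new byte[252], LookupWithNominalLength(253) throws System.IndexOutOfRangeException, while LookupWithActualLength(253) returns 0, which is the value the method already uses for codes outside its range.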
Additional Context:
The issue is observed in the following file:
D:......\UglyToad.PdfPig.Fonts\TrueType\Tables\CMapSubTables\ByteEncodingCMapTable.cs
Please let me know if you need any further information or clarification.
Hi @darbid, thanks a lot for the detailed issue. I think the easiest would be to update the CharacterCodeToGlyphIndex method.
Are you willing to create a PR to fix that? Or I can take care of it, just let me know.
EDIT: I think the following should be enough
public int CharacterCodeToGlyphIndex(int characterCode)
{
    if (characterCode < 0 || characterCode >= glyphMapping.Length)
    {
        return 0;
    }
    return glyphMapping[characterCode];
}
I am sorry I'm not experienced enough to do a PR. Could you please?
no prob, done