pdfparser
Bugfix for "Unable to find startxref" exception
Type of pull request
- [x] Bug fix (involves code and configuration changes)
- [ ] New feature (involves code and configuration changes)
- [ ] Documentation update
- [ ] Something else
About
Fixes inadequate handling of the startxref statement that precedes the EOF marker.
Currently, getXrefData() uses a regex that requires a newline between the startxref keyword and the offset value.
This fix also allows:
- a space before the reference offset.
- the keyword and the offset on the same line (e.g. "startxref 1746580").
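
To illustrate the two layouts listed above, here is a minimal sketch of a pattern that accepts any whitespace between the keyword and the offset. It is not the regex used in getXrefData(): the function name findStartxrefOffset() and the pattern itself are hypothetical and only show the idea.

```php
<?php

/**
 * Hypothetical helper, NOT part of pdfparser: returns the offset that
 * follows the last "startxref" keyword, or null if none is found.
 */
function findStartxrefOffset(string $pdfData): ?int
{
    // \s+ matches a space, a tab, or any newline sequence, so
    // "startxref\n1746580" and "startxref 1746580" are both accepted.
    $pattern = '/startxref\s+(\d+)\s*%%EOF/';

    // preg_match_all() returns 0 when nothing matches and false on error.
    if (!preg_match_all($pattern, $pdfData, $matches)) {
        return null;
    }

    // A file with incremental updates can contain several startxref
    // sections; this sketch takes the last one found.
    return (int) end($matches[1]);
}
```

With this sketch, both "startxref\n1746580\n%%EOF" and "startxref 1746580\n%%EOF" yield 1746580, while input without the keyword yields null.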
For more detail, see #756.
TO DO
- tests
- example files