vector PDF wrongly recognized as raster-only and not processed
What is the bug?
Vector PDFs (e.g. this minimal example) whose data is stored in Form XObjects are recognized as raster-only by ogr2ogr
and are therefore not processed at all.
Steps to reproduce the issue
> ogr2ogr data.shp data.pdf --debug on
...
PDF: Skipping unknown object /Fm3 at line 3
PDF: ParseContent(): reached line 4
PDF: ParseContent(): reached line 16
PDF: This is a raster-only PDF dataset, but it has been opened in vector-only mode
ERROR 1: Unable to open datasource `data.pdf' with the following drivers.
On the other hand, this works fine:
gdal_translate data.pdf data.tiff
Versions and provenance
GDAL 3.9.2, released 2024/08/13, on Debian Trixie.
Additional context
I presume that Form XObjects containing the actual data are ignored (see this line in the code and the debug message "PDF: Skipping unknown object /Fm3 at line 3"), so the driver finds no vector data and concludes that the file contains only raster data.
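To make the presumed mechanism concrete, here is a toy model in Python. It is not GDAL's code: the function names, the `xobjects` dictionary layout and the `follow_forms` switch are all hypothetical, and the content streams are simplified to whitespace-separated tokens. It only illustrates how skipping a Form XObject referenced by the `Do` operator could leave a page with zero detected drawing operations and hence a "raster-only" verdict.

```python
# Toy model only: this is NOT GDAL's PDF driver code.
# PDF path-painting operators (stroke / fill variants).
PAINT_OPS = {"S", "s", "f", "F", "f*", "B", "B*", "b", "b*"}

def count_paint_ops(stream, xobjects, follow_forms, depth=0):
    """Count path-painting operators in a simplified content stream.

    xobjects maps a name such as "/Fm3" to ("form", stream_text)
    or ("image", None). This layout is an assumption for the sketch.
    """
    if depth > 8:  # guard against self-referencing forms
        return 0
    tokens = stream.split()
    count = 0
    for i, tok in enumerate(tokens):
        if tok in PAINT_OPS:
            count += 1
        elif tok == "Do" and i > 0:
            kind, body = xobjects.get(tokens[i - 1], ("unknown", None))
            if kind == "form" and follow_forms:
                # Descend into the form's own content stream.
                count += count_paint_ops(body, xobjects, follow_forms, depth + 1)
            # Otherwise the object is skipped, as in the debug log.
    return count

def classify(page_stream, xobjects, follow_forms):
    """Return "vector" if any painting operator was found, else "raster-only"."""
    n = count_paint_ops(page_stream, xobjects, follow_forms)
    return "vector" if n > 0 else "raster-only"

# A page whose only content is a reference to a form holding one stroked line.
page = "q /Fm3 Do Q"
forms = {"/Fm3": ("form", "0 0 m 10 10 l S")}

skipped = classify(page, forms, follow_forms=False)
followed = classify(page, forms, follow_forms=True)
print(skipped, followed)
```

With `follow_forms=False` the form is skipped and the page looks empty of vector content, which matches the behaviour I am seeing. With `follow_forms=True` the stroked path inside `/Fm3` is found. Whether the real driver behaves this way is only my guess from the debug output.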