
vector PDF wrongly recognized as raster-only and not processed

Open mbudaj opened this issue 4 months ago • 3 comments

What is the bug?

Vector PDFs (e.g. this minimal example) whose data is stored in form XObjects are recognized as raster-only by ogr2ogr and are therefore not processed at all.

Steps to reproduce the issue

> ogr2ogr data.shp data.pdf --debug on
...
PDF: Skipping unknown object /Fm3 at line 3
PDF: ParseContent(): reached line 4
PDF: ParseContent(): reached line 16
PDF: This is a raster-only PDF dataset, but it has been opened in vector-only mode
ERROR 1: Unable to open datasource `data.pdf' with the following drivers.

On the other hand, this works fine:

gdal_translate data.pdf data.tiff

Versions and provenance

GDAL 3.9.2, released 2024/08/13 on Debian Trixie.

Additional context

I presume that form XObjects containing the actual data are ignored (see this line in the code and the debug message PDF: Skipping unknown object /Fm3 at line 3). The driver therefore finds no vector data in the file and concludes that it contains only raster data.
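To illustrate the suspected mechanism, here is a minimal Python sketch. It is not GDAL's parser: the two content streams, the function names and the naive whitespace tokenizer are all hypothetical and are not taken from data.pdf or from the PDF driver. It only relies on the fact that a PDF page paints a form XObject with the `Do` operator, so a page can contain no path operators of its own while all the geometry sits inside the form.

```python
import re

# "/Name Do" is the PDF operator that paints the XObject called Name.
_DO_OPERATOR = re.compile(rb"/([^\s/\[\]<>(){}%]+)\s+Do\b")

# Path-painting operators (stroke / fill variants) defined by the PDF spec.
_PATH_PAINT_OPS = {b"S", b"s", b"f", b"F", b"f*", b"B", b"B*", b"b", b"b*"}


def invoked_xobjects(content_stream: bytes) -> list[str]:
    """Names of the XObjects painted by an already-decoded content stream."""
    return [m.group(1).decode("latin-1")
            for m in _DO_OPERATOR.finditer(content_stream)]


def has_vector_paths(content_stream: bytes) -> bool:
    """Naive check: does the stream itself stroke or fill any path?

    Assumption: tokens are whitespace separated and no string literal
    happens to contain an operator name.
    """
    return any(tok in _PATH_PAINT_OPS for tok in content_stream.split())


# Hypothetical page content: draws nothing itself, only invokes /Fm3.
page_stream = b"q\n1 0 0 1 0 0 cm\n/Fm3 Do\nQ\n"
# Hypothetical body of the form XObject /Fm3: one stroked line.
form_stream = b"0 0 m\n10 10 l\nS\n"

print(invoked_xobjects(page_stream))   # XObjects referenced by the page
print(has_vector_paths(page_stream))   # paths found at page level only
print(has_vector_paths(form_stream))   # paths found inside the form
```

If the page-level scan is the only one performed, the geometry inside `/Fm3` is never seen, which would match the "raster-only" conclusion in the debug output above.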

mbudaj, Oct 16 '24 17:10