
Python bindings: allow creating & exporting `Transformer` without always needing a source dataset

Open HansBrende opened this issue 1 year ago • 4 comments

Feature description

I commonly create transformers such as the following:

transformer: gdal.GDALTransformerInfoShadow = gdal.Transformer(my_dataset, None, ['DST_SRS=EPSG:4326'])

which allows me to easily map pixel coordinates to their WGS84 coordinates.

However, the problem I am facing is that I must open the original image every time I want to do this.

What would be ideal is if I could do this same thing once (as above), and then export the relevant data from the transformer that I could, for example, store in a database. And then reinitialize the transformer later with exactly the same semantics without needing to re-open the original dataset to ensure this!

E.g. (suggested syntax):

stuff_I_can_serialize = transformer.getSourceTransform(), transformer.getSourceTransformArg()

later...

transformer: gdal.GDALTransformerInfoShadow = gdal.TransformerFromXYZ(*stuff_I_can_serialize)

I have not tried pickling the transformer or anything like that because even if it worked (doubtful), I wouldn't want it to stop working if a backwards-incompatible change was made to the internal memory representation... being able to serialize/deserialize in some kind of human-readable way would be ideal.

Additional context

No response

HansBrende avatar Jun 14 '24 23:06 HansBrende

You can work around that a bit by converting your source dataset to a VRT file and storing the VRT in your database. Opening the VRT just to pass it to gdal.Transformer() doesn't require accessing the original raster file.

rouault avatar Jun 14 '24 23:06 rouault

@rouault brilliant! I've just tested this strategy out and it works like a charm.

Only possible downside is that the VRT does include a lot of information that is not required, for example the VRTRasterBand elements.

I'm assuming that I could delete all of this unnecessary information in the actual storage process, keeping only a list of the XML elements which are actually required to reconstruct the transform, and then reconstruct a VRT file putting some junk data in for required elements such as VRTRasterBand.

The only elements I'd need to preserve, I'm assuming, would be:

  • SRS
  • GeoTransform
  • GCPList
  • plus any metadata items containing RPC information (which I believe are exactly those with attribute domain="RPC"?)

Is that correct or am I missing anything?
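
A minimal sketch of that stripping step, using only the Python standard library. The function name `strip_vrt` and the `KEEP_TAGS` set are hypothetical names chosen for illustration, and the empty `Byte` placeholder band is an assumption about what the VRT driver will accept in place of the original bands; it is not something confirmed in this thread.

```python
import xml.etree.ElementTree as ET

# Direct children of <VRTDataset> assumed sufficient to rebuild the
# transform, following the list above (hypothetical helper constant).
KEEP_TAGS = {"SRS", "GeoTransform", "GCPList"}


def strip_vrt(vrt_xml: str) -> str:
    """Return a reduced VRT that keeps only georeferencing-related elements."""
    root = ET.fromstring(vrt_xml)
    if root.tag != "VRTDataset":
        raise ValueError("expected a <VRTDataset> root element")

    # Iterate over a copy of the child list because we remove while looping.
    for child in list(root):
        is_rpc = child.tag == "Metadata" and child.get("domain") == "RPC"
        if child.tag not in KEEP_TAGS and not is_rpc:
            root.remove(child)

    # Placeholder band with no sources, standing in for the removed bands.
    # Assumption: a single empty Byte band is enough for the VRT driver.
    ET.SubElement(root, "VRTRasterBand", {"dataType": "Byte", "band": "1"})
    return ET.tostring(root, encoding="unicode")
```

The returned string is what would go into the database. Later it should be possible to pass it straight to gdal.Open(), which accepts VRT XML content in place of a filename, and hand the resulting dataset to gdal.Transformer() with the same options as before. The rasterXSize and rasterYSize attributes on the root element are left untouched.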

HansBrende avatar Jun 15 '24 00:06 HansBrende

Is that correct

yes

rouault avatar Jun 18 '24 22:06 rouault

Great, thanks! Well, that gives me a viable path forward. Since the solution is a bit hacky, I'll leave this issue open in case the use case is helpful in designing any future enhancements. If not, feel free to close!

HansBrende avatar Jun 19 '24 02:06 HansBrende