gdal icon indicating copy to clipboard operation
gdal copied to clipboard

Allow writing VRT from gdal_calc

Open dbaston opened this issue 1 year ago • 8 comments

Feature description

gdal_calc performs its calculations eagerly. It would be useful to be able to describe a transformation of remote datasets (e.g., NDVI) and store the result as a VRT.

Possible implementations:

  • Translate the expression to a Python pixel function (simpler, would require user of the VRT to set GDAL_VRT_ENABLE_PYTHON=YES
  • Translate simple expressions to C++ pixel functions (more complex, would likely result in lots of nesting)
  • Develop a new pixel function that can evaluate a simple expression, and have gdal_calc write a simple VRT containing the expression

Additional context

No response

dbaston avatar May 29 '24 18:05 dbaston

@hobu I've done a quick test with the exprtk library. It seems a bit overkill for this purpose, but it ships as a single MIT-licensed header and integrates very easily as a GDAL pixel function. For example, the following evaluates successfully:

<VRTDataset rasterXSize="1" rasterYSize="1">
<VRTRasterBand dataType="Float64" band="1" subClass="VRTDerivedRasterBand">
  <PixelFunctionType>expression</PixelFunctionType>
  <PixelFunctionArguments expression="if (A > B) 1.5*C ; else A" />
  <SimpleSource>
    <SourceFilename relativeToVRT="0">/tmp/pytest-of-dan/pytest-79/test_vrt_pixelfn_expression_co0/source_0.tif</SourceFilename>
    <SourceBand>1</SourceBand>
  </SimpleSource>
  <SimpleSource>
    <SourceFilename relativeToVRT="0">/tmp/pytest-of-dan/pytest-79/test_vrt_pixelfn_expression_co0/source_1.tif</SourceFilename>
    <SourceBand>1</SourceBand>
  </SimpleSource>
  <SimpleSource>
    <SourceFilename relativeToVRT="0">/tmp/pytest-of-dan/pytest-79/test_vrt_pixelfn_expression_co0/source_2.tif</SourceFilename>
    <SourceBand>1</SourceBand>
  </SimpleSource>
</VRTRasterBand>
</VRTDataset>

It would nice to figure out a way to specify a name for each <SimpleSource> and make that source name visible to the expression parser.

dbaston avatar Nov 04 '24 17:11 dbaston

@rouault I'm working through a basic implementation of gdal raster calc. One issue I'm wrestling with is whether calc should support multiple input files, which implies a need to support the various options of gdal_calc.py --extent. This lets us offer a simple syntax along the lines of gdal_calc.py, but calc ends up with some logic that is duplicative of gdal raster buildvrt. Or if we have multiple inputs, should the calc step happen in a pipeline after buildvrt ? This seems like a cleaner implementation but a more obscure usage.

I guess I'm leaning towards a syntax more gdal_calc.py but curious if you have thoughts.

dbaston avatar Jan 22 '25 01:01 dbaston

One issue I'm wrestling with is whether calc should support multiple input files,

I do think this would be more user friendly, although I'm not totally sure the current approach of gdal_calc.py of having --A=filename, --B=filename with only filename letters is the best one

Maybe "--input=name=NIR,file=rgbnir.tif,band=4" ? Hum, not so convinced, especially if file is not a simple filename, but a subdataset syntax...

rouault avatar Jan 22 '25 12:01 rouault

Maybe "--input=name=NIR,file=rgbnir.tif,band=4" ? Hum, not so convinced...

I think that @sgillies dislikes anyway, so better to ask early what he would suggest. But at least there are no quotation marks.

jratike80 avatar Jan 22 '25 12:01 jratike80

whether calc should support multiple input files, which implies a need to support the various options of gdal_calc.py --extent

Thinking this over some more, I think it makes sense for gdal raster calc to take multiple inputs but require them to have the same dimensions. I think this is the 99% use case, and for the remaining 1% a VRT input can be created elsewhere that does exactly what the user needs.

although I'm not totally sure the current approach of gdal_calc.py of having --A=filename, --B=filename with only filename letters is the best one

I don't especially care for it either.

As another alternative, rio calc uses --name "a=tests/data/RGB.byte.tif" and leaves selection of the band to the expression itself. This might work in ExprTk (where a could be a vector) but could be awkward for muparser.

dbaston avatar Jan 22 '25 13:01 dbaston

QGIS also performs band selection within the expression (syntax: raster_name@band_num). It does not appear to allow applying an expression to all bands of a raster.

dbaston avatar Jan 23 '25 16:01 dbaston

Thinking this over some more, I think it makes sense for gdal raster calc to take multiple inputs but require them to have the same dimensions. I think this is the 99% use case, and for the remaining 1% a VRT input can be created elsewhere that does exactly what the user needs.

Less confident in this being an adequate solution. I was trying to pull together an example of the tasseled cap transformation from a remote Sentinel 2 image, but gdal raster calc fails because bands have different resolutions:

export ROOT=/vsis3/sentinel-cogs/sentinel-s2-l2a-cogs/18/T/XQ/2024/8/S2B_18TXQ_20240829_0_L2A
gdal raster calc \
        --config CPL_DEBUG=ON \
        --config AWS_NO_SIGN_REQUEST=YES \
        --input "B1=${ROOT}/B01.tif" \
        --input "B2=${ROOT}/B02.tif" \
        --input "B3=${ROOT}/B03.tif" \
        --input "B4=${ROOT}/B04.tif" \
        --input "B5=${ROOT}/B05.tif" \
        --input "B6=${ROOT}/B06.tif" \
        --input "B7=${ROOT}/B07.tif" \
        --input "B8=${ROOT}/B08.tif" \
        --input "B8A=${ROOT}/B8A.tif" \
        --input "B9=${ROOT}/B09.tif" \
        --input "B11=${ROOT}/B11.tif" \
        --input "B12=${ROOT}/B12.tif" \
        --calc "-0.0635*B1 - 0.1128*B2 - 0.1680*B3 - 0.3480*B4 -0.3303*B5 + 0.0852*B6 + 0.3302*B7 + 0.3165*B8 + 0.0467*B9 - 0.0009*B10 - 0.4578*B11 - 0.4064*B12 + 0.3625*B8A" --output tasseled.vrt

Maybe the inputs should be required to have the same CRS and extent, but we should handle differing resolutions within gdal raster calc.

dbaston avatar Feb 10 '25 17:02 dbaston

Maybe the inputs should be required to have the same CRS and extent, but we should handle differing resolutions within gdal raster calc.

we can use GDALBuildVRT() under the hood to harmonize resolution, and possibly extent, although for the later we'd probably require the user to specify an explicit option like https://gdal.org/en/stable/programs/gdal_pansharpen.html#cmdoption-spat_adjust

rouault avatar Feb 10 '25 17:02 rouault