VRT connection "vrt://.." syntax progress/discussion

Open mdsumner opened this issue 2 years ago • 20 comments

The VRT connection syntax vrt://<dsn>?opt1=<args>&opt2=<args> was introduced in GDAL 3.1 with the bands option, and was gradually expanded in 3.7 and 3.8.

The options generally match those of gdal_translate, but the syntax differs in places. For example, rather than "-b" used multiple times there is a single ?bands=1,2,3, and for a_ullr the values are comma-separated rather than space-separated as on the command line.
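
As a rough illustration of that difference, the sketch below builds a connection string from Python values. The helper name `vrt_connection` and the file name `in.tif` are hypothetical (this is not a GDAL API); only the `bands` and `a_ullr` option names and their comma-separated form come from the description above. No URL escaping is attempted.

```python
def vrt_connection(dsn, **options):
    """Build a vrt:// connection string (hypothetical helper, not a GDAL API).

    List or tuple values are joined with commas, mirroring how the
    connection syntax replaces repeated or space-separated CLI arguments.
    """
    parts = []
    for key, value in options.items():
        if isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)
        parts.append(f"{key}={value}")
    if not parts:
        return f"vrt://{dsn}"
    return f"vrt://{dsn}?" + "&".join(parts)


# Roughly the connection-string counterpart of:
#   gdal_translate -b 1 -b 2 -b 3 -a_ullr 0 10 10 0 in.tif out.vrt
example = vrt_connection("in.tif", bands=[1, 2, 3], a_ullr=[0, 10, 10, 0])
```

The resulting string can be handed to anything that opens a GDAL dataset by name, which is what makes the syntax usable across languages.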

I've listed the remaining args here, classified by utility and progress, as a list for myself to refer back to and possibly to trigger interest from others.

When I started adding these I thought that many of these didn't make sense, but now I think they do. If you disagree or otherwise have comments I'd like to hear them.

Thanks!

done

see list here: https://gdal.org/drivers/raster/vrt.html#vrt-connection-string

doesn't have utility

  • of, sds

not done, key=value args

  • mask
  • colorinterp_bn
  • colorinterp
  • mo
  • co

not done, key=true/false args

  • strict
  • q
  • norat
  • noxmp
  • stats

new

  • sd_name, sd: choose sds by name or number, done in #8906
  • dmo add domain metadata items, like -mo

mdsumner avatar Mar 18 '23 09:03 mdsumner

  • sds (I really want this)

There's no way this can work with the vrt:// syntax. -sds is a special mode for the gdal_translate binary

rouault avatar Mar 18 '23 11:03 rouault

damn, I really wanted that 😃 - appreciate the explanation of why it can't work (I tried) 🙏

mdsumner avatar Mar 18 '23 12:03 mdsumner

damn, I really wanted that 😃 - appreciate the explanation of why it can't work (I tried) 🙏

wasn't that obvious it can't work ? -sds makes a materialized copy as a geotiff file of every subdataset of the source dataset. That's totally at odds with the VRT approach that exposes a single non-materialized dataset

rouault avatar Mar 18 '23 12:03 rouault

wait! yes, what I want for sds is a selector by name or number

it's a new option, missing from my categories above 🙏

mdsumner avatar Mar 18 '23 12:03 mdsumner

I updated the list above, sds was a total mistake

mdsumner avatar Mar 18 '23 13:03 mdsumner

  • subdataset choose sds by name or number

Do we really need that? subdataset is just a dsn.

rouault avatar Mar 19 '23 12:03 rouault

I find it awkward to construct them, you need two steps to interrogate for the format and names - but still I wonder if it makes sense. A problem is the driver not being explicit compared to the proper dsn.

The thing I like about this vrt connection syntax is that it's so simple, has no software requirements, no file artefacts, and is universal (cross-language). I guess if the connection gets the facility the other apps should have it too, though, so I'll think about that.

mdsumner avatar Mar 19 '23 22:03 mdsumner

I'm working on a subdataset option, i.e. vrt://myfile.nc?subdataset=myarray. What do you think of having two possible forms, so that an integer index of the subdataset can be used? i.e.

vrt://myfile.nc?subdataset_n=1

using the same numbering as SUBDATASET_1_NAME, SUBDATASET_2_NAME, ...

It's conceivable that a subdataset could have name "1" so I thought an extra arg would be a good idea.
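
A minimal sketch of that numbering, assuming a plain dictionary shaped like the SUBDATASET_n_NAME / SUBDATASET_n_DESC metadata items. The dictionary contents are made up for illustration and the helper `subdataset_by_index` is hypothetical; it only shows why a name of "1" and an index of 1 need to be told apart.

```python
def subdataset_by_index(metadata, n):
    """Return the SUBDATASET_<n>_NAME entry from a subdataset metadata dict."""
    key = f"SUBDATASET_{n}_NAME"
    if key not in metadata:
        raise KeyError(f"no subdataset number {n}")
    return metadata[key]


# Made-up metadata for illustration only: the second array is literally named "1".
sample = {
    "SUBDATASET_1_NAME": 'NETCDF:"myfile.nc":myarray',
    "SUBDATASET_1_DESC": "made-up description of myarray",
    "SUBDATASET_2_NAME": 'NETCDF:"myfile.nc":1',
    "SUBDATASET_2_DESC": "made-up description of 1",
}

first = subdataset_by_index(sample, 1)   # the array called "myarray"
second = subdataset_by_index(sample, 2)  # the array called "1"
```

With a single argument, a value of 1 could mean either entry here, which is the ambiguity a separate index argument avoids.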

mdsumner avatar Nov 29 '23 18:11 mdsumner

what do you think of having two possible forms so that an integer index of the subdataset can be used?

why is that needed? Why not just use the subdataset name after vrt://?

rouault avatar Nov 29 '23 18:11 rouault

because dsn:name also requires the driver prefix DRV:dsn:name, the number is convenient because the names can be awkward - I feel like I haven't explained this properly, or maybe I'm missing something ... (will have a draft PR soon just chasing down a netcdf problem)

Say with oisst-avhrr-v02r01.20220218.nc, I have to have the driver identified and the name identified i.e.

NetCDF:oisst-avhrr-v02r01.20220218.nc:sst

but, gdal automatically detects the driver so I want my user-experience to be "get this name" or "get the first (or second or third ...)" one - I can't get the fully qualified sds name without opening the file or having a real human type it in.

These variants make it entirely automatable without invoking gdal at all (number is convenient but dangerous because name order can vary but I see that as a valid alternative mode for investigating sources)

sprintf("vrt://%s?subdataset=%s", dsn, name)

sprintf("vrt://%s?subdataset_n=%i", dsn, 1)
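
For comparison, a Python sketch of the same string construction. The function names are hypothetical, and `subdataset` and `subdataset_n` are the option names proposed in this comment, not necessarily what was merged (the list at the top of the issue records sd_name and sd, done in #8906). No GDAL call is needed to build either form.

```python
def qualified_name(driver, dsn, name):
    """Fully qualified subdataset name: needs the driver prefix up front."""
    return f"{driver}:{dsn}:{name}"


def by_name(dsn, name):
    """Proposed form: select a subdataset by name, driver auto-detected."""
    return f"vrt://{dsn}?subdataset={name}"


def by_number(dsn, n):
    """Proposed form: select a subdataset by 1-based position (fragile)."""
    return f"vrt://{dsn}?subdataset_n={n:d}"


dsn = "oisst-avhrr-v02r01.20220218.nc"
full = qualified_name("NetCDF", dsn, "sst")
named = by_name(dsn, "sst")
numbered = by_number(dsn, 1)
```

Only the first form requires knowing the driver; the other two can be generated from a file name and a variable name or position alone.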

edit: made a few fixes

mdsumner avatar Nov 29 '23 18:11 mdsumner

I'm not opposed to your proposals. The doc should however mention that using subdataset_{n} might be fragile as there's no guarantee that subdataset numbering will remain stable among GDAL versions.

rouault avatar Nov 29 '23 21:11 rouault

ok no problem, thanks for the feedback!

mdsumner avatar Nov 29 '23 21:11 mdsumner

Apologies if this doesn't belong here, but has there been any consideration of adding a target extent option for the connection string? This is available in gdalbuildvrt, but not in gdal_translate.

ljstrnadiii avatar Jan 04 '24 16:01 ljstrnadiii

Apologies if this doesn't belong here, but has there been any consideration of adding a target extent option for the connection string?

this is the projwin option: https://gdal.org/programs/gdal_translate.html#cmdoption-gdal_translate-projwin

rouault avatar Jan 04 '24 16:01 rouault

@rouault great! And the origin will match the NW point of the provided bounds?

ljstrnadiii avatar Jan 04 '24 17:01 ljstrnadiii

And the origin will match the NW point of the provided bounds?

yes

rouault avatar Jan 04 '24 17:01 rouault
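
A small sketch of the projwin exchange above, assuming the connection string takes projwin as four comma-separated values in the same ulx, uly, lrx, lry order as gdal_translate -projwin. The helper `projwin_connection`, the file name and the bounds are hypothetical.

```python
def projwin_connection(dsn, xmin, ymin, xmax, ymax):
    """Turn (xmin, ymin, xmax, ymax) bounds into a vrt:// projwin string.

    projwin is ordered upper-left then lower-right, so the origin of the
    result is the NW corner (xmin, ymax) of the requested bounds.
    """
    ulx, uly, lrx, lry = xmin, ymax, xmax, ymin
    return f"vrt://{dsn}?projwin={ulx},{uly},{lrx},{lry}"


# Hypothetical target extent: x from 100 to 160, y from -50 to -10.
conn = projwin_connection("in.tif", 100, -50, 160, -10)
```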