tensorstore
Auto-detect driver
Hi! Is there a way to auto-detect the driver (and maybe other parts of the spec) of an already existing dataset? I am thinking of something like:
>>> dataset = ts.open({'kvstore': 'gs://neuroglancer-janelia-flyem-hemibrain/v1.1/segmentation/'}).result()
>>> dataset
TensorStore({
  ...
  'driver': 'neuroglancer_precomputed',
  'dtype': 'uint64',
  'kvstore': {
    'bucket': 'neuroglancer-janelia-flyem-hemibrain',
    'driver': 'gcs',
    'path': 'v1.1/segmentation/',
  },
  ...
})
There isn't currently any format auto-detection logic, but it is something we have discussed previously, and I am inclined to implement it in conjunction with support for the URL syntax I proposed.
The syntax would probably be:
ts.open('gs://..') or ts.open({'driver': 'auto', 'kvstore': 'gs://...'})
If we have, e.g., a zarr array at the root of a zip file (or, similarly, of an OCDBT database), then that could potentially be auto-detected as well: gs://bucket/zipfile.zip would be auto-detected as gs://bucket/zipfile.zip|zip:|zarr3:. I'm not sure whether auto-detection of kvstore adapter formats like that would also be supported, or just of the array formats.
In general, auto-detection would probably work by having each candidate driver specify a set of relative paths to check and, for the case of a file rather than a directory, the number of header/trailer bytes it needs to read.
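To make that idea concrete, here is a rough sketch in plain Python. It is not TensorStore code: `Probe`, `PROBES`, and `detect_driver` are hypothetical names, a plain dict stands in for the kvstore, and only header bytes (not trailer bytes) are checked. The marker files used (`zarr.json`, `.zarray`, `attributes.json`, `info`) are the metadata files those formats normally keep at their root, and `PK\x03\x04` is the usual zip local-file-header signature. A real implementation would likely also need to inspect file contents to resolve ambiguous cases.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional, Tuple


@dataclass(frozen=True)
class Probe:
    """What one (hypothetical) driver asks the auto-detector to check."""
    driver: str
    # Relative paths to look for when the target is a directory.
    relative_paths: Tuple[str, ...] = ()
    # Number of leading bytes to read when the target is a single file.
    header_bytes: int = 0
    # Predicate applied to those leading bytes.
    matches_header: Optional[Callable[[bytes], bool]] = None


# Illustrative registry; the order defines priority when several probes match.
PROBES = [
    Probe("zarr3", relative_paths=("zarr.json",)),
    Probe("zarr", relative_paths=(".zarray",)),
    Probe("n5", relative_paths=("attributes.json",)),
    Probe("neuroglancer_precomputed", relative_paths=("info",)),
    Probe("zip", header_bytes=4,
          matches_header=lambda head: head == b"PK\x03\x04"),
]


def detect_driver(store: Mapping[str, bytes], path: str) -> Optional[str]:
    """Return the first driver whose probe matches `path`, else None.

    `store` is a dict standing in for a kvstore: key -> file contents.
    A `path` that is empty or ends in "/" is treated as a directory.
    """
    is_directory = path == "" or path.endswith("/")
    for probe in PROBES:
        if is_directory:
            # Directory case: look for any of the driver's marker files.
            if any(path + rel in store for rel in probe.relative_paths):
                return probe.driver
        elif probe.matches_header is not None and path in store:
            # File case: read only the requested number of header bytes.
            if probe.matches_header(store[path][:probe.header_bytes]):
                return probe.driver
    return None


# Example with an in-memory stand-in for a bucket.
fake_bucket = {
    "v1.1/segmentation/info": b"{}",
    "zipfile.zip": b"PK\x03\x04" + b"\x00" * 16,
}
print(detect_driver(fake_bucket, "v1.1/segmentation/"))
print(detect_driver(fake_bucket, "zipfile.zip"))
```

Keeping each driver's requirements declarative (paths plus byte counts) would let the detector gather them all up front and issue the reads together, rather than trying each driver in turn against a remote store.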