Dimension names from cube:dimensions

Open clausmichele opened this issue 1 year ago • 1 comments

It would be nice if, when the datacube extension is present at the Collection or Item level, the dimension names it provides were reflected in the returned xarray object. Currently, the dimension names are always the defaults.

Sample STAC Collection with datacube extension:

import json
import pystac
import pystac_client

url = "https://stac.eurac.edu/collections/SENTINEL2_L2A_SAMPLE"

stac_api = pystac_client.stac_api_io.StacApiIO()
stac_dict = json.loads(stac_api.read_text(url))
b_dim = None
t_dim = None
x_dim = None
y_dim = None
z_dim = None
if "cube:dimensions" in stac_dict:
    for dim in stac_dict["cube:dimensions"]:
        if stac_dict["cube:dimensions"][dim]["type"] == "bands":
            b_dim = dim
        if stac_dict["cube:dimensions"][dim]["type"] == "temporal":
            t_dim = dim
        if stac_dict["cube:dimensions"][dim]["type"] == "spatial":
            if stac_dict["cube:dimensions"][dim]["axis"] == "x":
                x_dim = dim
            if stac_dict["cube:dimensions"][dim]["axis"] == "y":
                y_dim = dim
            if stac_dict["cube:dimensions"][dim]["axis"] == "z":
                z_dim = dim
print(b_dim,t_dim,x_dim,y_dim,z_dim)

>>> bands t x y None

Result from stackstac:

import pystac_client
import stackstac

catalog_url = "https://stac.eurac.edu/"
collection = "SENTINEL2_L2A_SAMPLE"

catalog = pystac_client.Client.open(catalog_url)
query_params = {"collections": [collection]}

items = catalog.search(**query_params).item_collection()
data = stackstac.stack(items)
print(data.dims)

>>> ('time', 'band', 'y', 'x')

I understand that in the example above I'm passing STAC Items that do not contain the cube:dimensions field, which is provided only at the Collection level. Would it make sense to offer an option to use the dimension names from the STAC metadata itself?
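As a possible workaround in the meantime, the names could be mapped by hand after stacking. The following is only a minimal sketch: `STACKSTAC_DEFAULTS` and `dimension_rename_map` are hypothetical helpers (not part of stackstac or pystac), the default names are taken from the `data.dims` output above, and the `sample` dictionary is an assumed stand-in shaped like the collection's cube:dimensions field.

```python
# Hypothetical helper, not part of stackstac: default dimension names,
# keyed by datacube dimension type (or by axis for spatial dimensions).
# The values are the names seen in the data.dims output above.
STACKSTAC_DEFAULTS = {"temporal": "time", "bands": "band", "x": "x", "y": "y"}


def dimension_rename_map(cube_dimensions):
    """Build an {old_name: new_name} mapping from a cube:dimensions dict."""
    rename = {}
    for name, spec in cube_dimensions.items():
        # Spatial dimensions are identified by their axis, others by their type.
        if spec.get("type") == "spatial":
            key = spec.get("axis")
        else:
            key = spec.get("type")
        default = STACKSTAC_DEFAULTS.get(key)
        # Skip dimensions with no default counterpart (e.g. z) or unchanged names.
        if default is not None and default != name:
            rename[default] = name
    return rename


# Assumed example input, shaped like the collection's cube:dimensions.
sample = {
    "t": {"type": "temporal"},
    "bands": {"type": "bands"},
    "x": {"type": "spatial", "axis": "x"},
    "y": {"type": "spatial", "axis": "y"},
}

mapping = dimension_rename_map(sample)
print(mapping)  # {'time': 't', 'band': 'bands'}
```

The resulting mapping could then be passed to xarray's `rename`, for example `data = data.rename(mapping)`, assuming `data` is the DataArray returned by `stackstac.stack`.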

The same issue has also been opened for odc-stac, which likewise uses default names: https://github.com/opendatacube/odc-stac/issues/136

clausmichele, Dec 18 '23 09:12

@clausmichele I believe stackstac and odc-stac provide "time", "band", "y", "x" because these are the conventions from rasterio, aren't they? And I'm afraid stackstac is designed to support rasterio-supported file formats.

Berhinj, Jan 03 '24 09:01