
Confusing extension detection

Open soxofaan opened this issue 1 year ago • 4 comments

Using pystac 1.10.1:

import pystac

stac_obj = pystac.Catalog(
    id="foo",
    description="Foo",
    stac_extensions=[
        "https://stac-extensions.github.io/datacube/v2.2.0/schema.json"
    ]
)

assert stac_obj.ext.has("cube")

stac_obj.ext.cube

The assert passes, but the last line fails with:

AttributeError: 'CatalogExt' object has no attribute 'cube'

Is that intended behavior?

To use catalog.ext.cube safely in generic code, I guess I have to guard it with an additional hasattr check:

if stac_obj.ext.has("cube") and hasattr(stac_obj.ext, "cube"):
    x = stac_obj.ext.cube ...

I'd hoped that stac_obj.ext.has("cube") alone would be enough to use as a guard.

(Note: I'm aware that a Catalog is not supposed to have the datacube extension enabled, but I want to make my code robust against slightly "invalid" STAC data too)

soxofaan, Aug 20 '24 20:08

I'm aware that a Catalog is not supposed to have the datacube extension enabled, but I want to make my code robust against slightly "invalid" STAC data too

I'd argue that pystac is working as intended. As an implementation of the spec and its extensions, the code provided (IMO) should reflect the valid use-case, with methods and functions to help bring invalid STAC into a valid state.

In this example, it's not clear what type catalog.ext.cube should even be. CollectionDatacubeExtension, ItemDatacubeExtension, AssetDatacubeExtension, and ItemAssetsDatacubeExtension are each unique implementations of DatacubeExtension for specific STAC object types. There's no equivalent CatalogDatacubeExtension to return from catalog.ext.cube.

Each extension's ext classmethod implements its own check of whether the extension is valid for a given object type. Instead of your hasattr check, you could use ext:

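from pystac.errors import ExtensionTypeError
from pystac.extensions.datacube import DatacubeExtension

try:
    # ext() raises ExtensionTypeError for unsupported object types
    cube = DatacubeExtension.ext(stac_obj)
except ExtensionTypeError:
    cube = None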

This still won't get you datacube extension information for a catalog, however. For that, you'll need to manipulate the catalog's attributes directly via its extra_fields dictionary (dict[str, Any]).
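As a minimal sketch of that last point: the helper below (cube_fields is a hypothetical name, not part of pystac) picks the datacube entries out of a plain dictionary by their "cube:" key prefix. The sample dictionary is a made-up stand-in for catalog.extra_fields.

```python
from typing import Any


def cube_fields(extra_fields: dict[str, Any]) -> dict[str, Any]:
    """Return only the entries whose key carries the datacube "cube:" prefix."""
    return {k: v for k, v in extra_fields.items() if k.startswith("cube:")}


# Stand-in for catalog.extra_fields; the values are invented for illustration.
extra_fields = {
    "cube:dimensions": {"t": {"type": "temporal"}},
    "license": "proprietary",
}
print(cube_fields(extra_fields))
```

With a real catalog this would presumably be called as cube_fields(catalog.extra_fields). The result is raw, unvalidated data, not an extension object.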

gadomski, Aug 21 '24 12:08

In this example, it's not clear what type catalog.ext.cube should even be.

I agree, that's not the thing. My point is if catalogs don't support the (data)cube extension per STAC spec, it's weird to still get True from catalog.ext.has("cube")

soxofaan, Aug 21 '24 14:08

My point is if catalogs don't support the (data)cube extension per STAC spec, it's weird to still get True from catalog.ext.has("cube")

It's a good point. has_extension just looks at the stac_extensions field, without considering whether those extension URLs are valid when applied to that object.

I don't have a good solution to the problem in a code sense, so maybe better documentation on has is the answer?
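For illustration only, the gap between "declared" and "applicable" described above can be sketched on plain STAC dictionaries. This is not pystac's implementation. The function names and the APPLICABLE_TYPES mapping are assumptions made up for this example.

```python
DATACUBE_URL = "https://stac-extensions.github.io/datacube/v2.2.0/schema.json"

# Assumption for this sketch: the STAC "type" values the extension applies to
# ("Feature" is the type of a STAC Item).
APPLICABLE_TYPES = {DATACUBE_URL: {"Collection", "Feature"}}


def is_declared(stac_dict, url):
    """True if the URL is listed in stac_extensions, whatever the object type."""
    return url in (stac_dict.get("stac_extensions") or [])


def is_applicable(stac_dict, url):
    """True only if the URL is declared AND the object type supports it."""
    allowed = APPLICABLE_TYPES.get(url, set())
    return is_declared(stac_dict, url) and stac_dict.get("type") in allowed


catalog = {
    "type": "Catalog",
    "id": "foo",
    "description": "Foo",
    "stac_extensions": [DATACUBE_URL],
}
print(is_declared(catalog, DATACUBE_URL), is_applicable(catalog, DATACUBE_URL))
```

The first check corresponds to what has reports today. The second is the stricter answer the original post was hoping for.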

gadomski, Aug 21 '24 17:08

I guess we could have has do more, but like @gadomski said, it is just a different invocation of has_extension. If we were to extend it to check validity then the response would not be True or False in the case you describe - it would be an error.

If you are just trying to get the cube extension class you don't really need has at all:

if hasattr(stac_obj.ext, "cube"):
    x = stac_obj.ext.cube ...

or

x = getattr(stac_obj.ext, "cube", None)
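The getattr variant can be wrapped in a small helper for generic code. This is a sketch only: get_ext is a hypothetical name, and the SimpleNamespace objects are stand-ins for real STAC objects so the snippet runs without pystac.

```python
from types import SimpleNamespace


def get_ext(stac_obj, name, default=None):
    """Return stac_obj.ext.<name>, or default when either attribute is missing."""
    ext = getattr(stac_obj, "ext", None)
    return getattr(ext, name, default)


# Hypothetical stand-ins: one whose .ext exposes "cube", one whose .ext does not.
with_cube = SimpleNamespace(ext=SimpleNamespace(cube="cube-accessor"))
without_cube = SimpleNamespace(ext=SimpleNamespace())

print(get_ext(with_cube, "cube"))
print(get_ext(without_cube, "cube"))
```

Note that getattr with a default only suppresses AttributeError. Any other exception raised while the attribute is being resolved would still propagate to the caller.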

jsignell, Sep 04 '24 14:09