Implement `alternate` extension

Open matthewhanson opened this issue 3 years ago • 5 comments

Add the alternate extension: https://github.com/stac-extensions/alternate-assets

Would be nice to have an easy way to fetch an Asset that includes the alternate asset URL in place of the normal URL.

for example,

asset = item.get_asset('red', 's3')

And the asset returned has an href equal to the s3 URL rather than the original one from the asset.

In addition, alternate assets may contain fields from other extensions (e.g., storage), see: https://ibhoyw8md9.execute-api.us-west-2.amazonaws.com/prod/collections/landsat-c2l1/items/LC08_L1TP_225099_20210731_20210731_02_T1

Those should be merged into the asset returned (overriding any existing property)

matthewhanson avatar Aug 02 '21 21:08 matthewhanson

Would be nice to have an easy way to fetch an Asset that includes the alternate asset URL in place of the normal URL.

for example,

asset = item.get_asset('red', 's3')

And the asset returned has an href equal to the s3 URL rather than the original one from the asset.

I think we probably want to keep this logic contained to the extension implementation rather than implementing it as part of pystac.Item. We don't add any other extension-specific logic to pystac.Item, and I think it would be best to stick to that to avoid bogging that class down with lots of additional functionality. I'm thinking the implementation of this would look something like:

item: pystac.Item = ...
alt_ext = AlternateExtension.ext(item)
s3_asset = alt_ext.get_asset("red", "s3")
print(s3_asset.href)
# s3://some-bucket...

In addition, alternate assets may contain fields from other extensions (e.g., storage), see: https://ibhoyw8md9.execute-api.us-west-2.amazonaws.com/prod/collections/landsat-c2l1/items/LC08_L1TP_225099_20210731_20210731_02_T1

I'm getting a 404 for that resource. Is there another example we could take a look at?

Those should be merged into the asset returned (overriding any existing property)

Not sure I totally understand this request. Any Asset that gets returned would just be a normal pystac.Asset, so other extension fields would be available either through Asset.extra_fields or by extending that Asset using the appropriate extension implementation (e.g. StorageExtension.ext(asset).region). Did you have other functionality that you were thinking about?

duckontheweb avatar Feb 08 '22 14:02 duckontheweb

Ok that makes sense. It would be nice to have something a bit more transparent...like some assets may have an s3 URL and some may not.

Here's an updated example: https://landsatlook.usgs.gov/stac-server/collections/landsat-c2l2-sr/items/LC09_L2SR_081116_20220215_20220217_02_T2_SR

The index asset doesn't have an s3 URL...and I could see some cases where data has an alternate s3 URL and metadata does not for some reason.

In that case I'd rather not build in that logic and just say "get me the s3 URLs if they exist".
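A minimal sketch of "get me the s3 URLs if they exist", written against plain asset dictionaries rather than pystac objects. The helper name `alternate_hrefs` and the example hrefs are hypothetical; the only thing assumed from the alternate-assets extension is that an asset may carry an `alternate` mapping whose entries each have an `href`.

```python
from typing import Any, Dict


def alternate_hrefs(assets: Dict[str, Dict[str, Any]], name: str) -> Dict[str, str]:
    """Return {asset_key: href} for every asset that has an alternate called `name`.

    Assets without that alternate are skipped rather than raising an error.
    """
    hrefs: Dict[str, str] = {}
    for key, asset in assets.items():
        alternate = asset.get("alternate", {}).get(name)
        if alternate is not None and "href" in alternate:
            hrefs[key] = alternate["href"]
    return hrefs


# Hypothetical assets: "red" has an s3 alternate, "index" does not.
example_assets = {
    "red": {
        "href": "https://example.com/data/red.tif",
        "alternate": {"s3": {"href": "s3://example-bucket/data/red.tif"}},
    },
    "index": {"href": "https://example.com/data/index.html"},
}

s3_hrefs = alternate_hrefs(example_assets, "s3")
```

Here `s3_hrefs` contains an entry only for `red`; the `index` asset is left out instead of causing a failure.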

This is related to the last question; see the example above. The alternate asset is not a complete description of the asset but only href and possibly title and description, as well as potentially fields from the storage extension. Fields like type or info from other extensions (e.g., eo:bands, gsd) would not be included. But when I get the asset I want all the info about the asset; it's just that I want one of the other URLs. This is an actual duplication of the data: if file:checksum were used it should be the same for both files. So would the returned asset from the extension be a merge of the alternate info with the original asset (overriding original fields if they exist)?

matthewhanson avatar Feb 18 '22 21:02 matthewhanson

Here's an updated example: https://landsatlook.usgs.gov/stac-server/collections/landsat-c2l2-sr/items/LC09_L2SR_081116_20220215_20220217_02_T2_SR

Thanks, I'll use that for testing and fleshing out the implementation.

This is related to the last question; see the example above. The alternate asset is not a complete description of the asset but only href and possibly title and description, as well as potentially fields from the storage extension. Fields like type or info from other extensions (e.g., eo:bands, gsd) would not be included. But when I get the asset I want all the info about the asset; it's just that I want one of the other URLs. This is an actual duplication of the data: if file:checksum were used it should be the same for both files. So would the returned asset from the extension be a merge of the alternate info with the original asset (overriding original fields if they exist)?

That makes sense. It should be straightforward to merge the top-level Asset fields with the fields specific to a given alternative asset. There may be a couple of wrinkles to work out around handling the ownership of these assets, handling the extra_fields, etc., but I can work those out in the PR.
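As a rough illustration of that merge (not the eventual pystac implementation), here is a sketch on plain dictionaries. The function name `merge_alternate`, the example values, and the choice to drop the `alternate` mapping from the result are all assumptions made for the example.

```python
import copy
from typing import Any, Dict


def merge_alternate(asset: Dict[str, Any], name: str) -> Dict[str, Any]:
    """Return a copy of `asset` with the fields of alternate `name` laid on top.

    Fields present in the alternate (href, title, storage:* ...) override the
    top-level ones; everything else (type, eo:bands, gsd ...) is kept as-is.
    """
    alternates = asset.get("alternate", {})
    if name not in alternates:
        raise KeyError(f"asset has no alternate named {name!r}")

    merged = copy.deepcopy(asset)
    # Assumption: the resolved asset no longer advertises its alternates.
    merged.pop("alternate")
    merged.update(copy.deepcopy(alternates[name]))
    return merged


# Hypothetical asset with one s3 alternate carrying a storage extension field.
red = {
    "href": "https://example.com/data/red.tif",
    "type": "image/tiff; application=geotiff; profile=cloud-optimized",
    "title": "Red band",
    "alternate": {
        "s3": {
            "href": "s3://example-bucket/data/red.tif",
            "storage:platform": "AWS",
        }
    },
}

s3_red = merge_alternate(red, "s3")
```

The original dictionary is left untouched, which sidesteps some of the ownership questions for this sketch; a real implementation working with pystac.Asset objects would still have to decide who owns the returned asset.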

duckontheweb avatar Feb 22 '22 19:02 duckontheweb

This implementation is going to be a bit different from our typical extension implementations, so I want to give some time for feedback. I'm moving this into the next minor release so we don't hold up 1.4.

duckontheweb avatar Feb 23 '22 16:02 duckontheweb

@matthewhanson I don't think I'm going to be able to implement this for the 1.5 release. If you (or someone else) have availability to work on this, that would be helpful. Otherwise I'll push this off to a future release.

duckontheweb avatar Jun 28 '22 15:06 duckontheweb