stac-fastapi
User transforms on item documents
I would like to inject custom code to transform item documents in some way as they travel from database to the user. I feel like this would be a useful feature to have regardless of the db backend.
My specific use case is to convert asset hrefs from relative paths to absolute ones, based on the original self link recorded in the database. I would like to keep storing relative asset links in the database, but because the item's self link changes when the item is accessed over the API, those relative hrefs become invalid. Recording them in absolute form would solve this specific problem, but it makes data relocation more involved.
Alternatively, an option to "make asset hrefs absolute using the original self link" would be useful and would avoid the need to write custom code.
In pgstac this could be done inside this function, for example:
https://github.com/stac-utils/stac-fastapi/blob/162a1a2c324b4c2bfe3451f7ae19d7840a0e0452/stac_fastapi/pgstac/stac_fastapi/pgstac/core.py#L187-L191
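As a rough illustration of what such a "make asset hrefs absolute" option might do, here is a minimal sketch using `urllib.parse.urljoin` from the standard library. The function name `absolutize_asset_hrefs` and the sample item are made up for this example and are not part of stac-fastapi.

```python
from urllib.parse import urljoin


def absolutize_asset_hrefs(item: dict) -> dict:
    """Resolve relative asset hrefs against the item's self link, in place."""
    self_href = next(
        (link["href"] for link in item.get("links", []) if link.get("rel") == "self"),
        None,
    )
    if self_href is None:
        return item  # nothing to resolve against
    for asset in item.get("assets", {}).values():
        # urljoin returns hrefs that already carry a scheme unchanged,
        # so calling this twice on the same item is harmless
        asset["href"] = urljoin(self_href, asset["href"])
    return item


# Hypothetical item, for illustration only
item = {
    "links": [{"rel": "self", "href": "https://example.com/data/scene-1/item.json"}],
    "assets": {
        "b1": {"href": "b1.tif"},
        "b2": {"href": "s3://bucket/b2.tif"},
    },
}
absolutize_asset_hrefs(item)
```

One caveat: `urljoin` only resolves relative references for schemes it treats as hierarchical (such as http, https and file). If the stored self link were itself an s3:// URL, the relative href would come back unchanged, and plain string handling would be needed instead.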
Another use case for this feature would be to convert s3:// asset hrefs to s3 signed https:// urls dynamically for users with particular authorizations, or to dynamically inject additional s3 signed url assets conforming to the alternate assets STAC Extension.
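A transform of this kind could look roughly like the sketch below. The signer is passed in as a callable so the example runs without AWS credentials; `sign_s3_assets` and `fake_sign` are hypothetical names, and deciding which users are entitled to signed URLs is left to the caller.

```python
from typing import Callable
from urllib.parse import urlparse


def sign_s3_assets(item: dict, sign: Callable[[str, str], str]) -> dict:
    """Replace s3:// asset hrefs with whatever sign(bucket, key) returns."""
    for asset in item.get("assets", {}).values():
        parsed = urlparse(asset["href"])
        if parsed.scheme == "s3":
            asset["href"] = sign(parsed.netloc, parsed.path.lstrip("/"))
    return item


def fake_sign(bucket: str, key: str) -> str:
    # Stand-in signer for illustration only. With boto3 this could instead call
    # s3_client.generate_presigned_url(
    #     "get_object", Params={"Bucket": bucket, "Key": key}, ExpiresIn=3600)
    return f"https://{bucket}.s3.amazonaws.com/{key}?X-Amz-Signature=fake"


# Hypothetical item, for illustration only
item = {
    "assets": {
        "data": {"href": "s3://my-bucket/path/to/data.tif"},
        "thumb": {"href": "https://example.com/thumb.png"},
    }
}
sign_s3_assets(item, fake_sign)
```

Only hrefs with the s3 scheme are touched; anything already served over https is left alone.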
For my use case, I ended up just overriding the CoreCrudClient.get_item() method since we only wanted to add the extra signed urls on the Get Item endpoint. I don't think this solution would scale all that well as it would need to be added to each individual method and could be conflated with potentially unrelated logic.
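The shape of that override is roughly as follows. To keep the sketch runnable on its own, `CoreCrudClient` here is a stand-in class, not the real stac-fastapi one, and the `get_item(item_id, collection_id, **kwargs)` signature is an assumption that should be checked against the backend version in use.

```python
import asyncio


class CoreCrudClient:
    """Stand-in for the backend client; NOT the real stac-fastapi class."""

    async def get_item(self, item_id: str, collection_id: str, **kwargs) -> dict:
        return {
            "type": "Feature",
            "id": item_id,
            "collection": collection_id,
            "assets": {"data": {"href": "s3://bucket/data.tif"}},
        }


class SigningClient(CoreCrudClient):
    """Overrides only get_item, so every other endpoint returns items unchanged."""

    async def get_item(self, item_id: str, collection_id: str, **kwargs) -> dict:
        item = await super().get_item(item_id, collection_id, **kwargs)
        for asset in item["assets"].values():
            if asset["href"].startswith("s3://"):
                # Placeholder transform; a real one would produce a signed URL
                asset["href"] = asset["href"].replace(
                    "s3://", "https://signed.example/", 1
                )
        return item


item = asyncio.run(SigningClient().get_item("item-1", "collection-1"))
```

Covering search and item-collection endpoints the same way would mean repeating the override in each method, which is the scaling concern raised above.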
Another solution for these types of problems could be a FastAPI middleware, which didn't work quite as well for my use case in that it was difficult to extract which route was being operated on for any given invocation of the middleware function. I had the same problem when I dropped to Starlette's ASGI middleware and tried introspecting the provided Scope.
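One possible workaround, which is not what was done here, is to ignore the route entirely and dispatch on the response payload: a GeoJSON `type` of `Feature` or `FeatureCollection` marks item documents. Below is a minimal pure-ASGI sketch under that assumption. `ItemTransformMiddleware`, `demo_app` and `tag` are hypothetical names, and the middleware buffers the whole response body, so it would not suit streamed responses.

```python
import asyncio
import json


class ItemTransformMiddleware:
    """Pure ASGI middleware that rewrites STAC items found in JSON responses."""

    def __init__(self, app, transform):
        self.app = app
        self.transform = transform

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        start = None
        chunks = []

        async def wrapped_send(message):
            nonlocal start
            if message["type"] == "http.response.start":
                start = message  # hold back until the final body size is known
                return
            if message["type"] != "http.response.body":
                await send(message)
                return
            chunks.append(message.get("body", b""))
            if message.get("more_body", False):
                return
            body = self._rewrite(b"".join(chunks))
            headers = [
                (k, v) for k, v in start.get("headers", [])
                if k.lower() != b"content-length"
            ]
            headers.append((b"content-length", str(len(body)).encode()))
            await send({**start, "headers": headers})
            await send({"type": "http.response.body", "body": body})

        await self.app(scope, receive, wrapped_send)

    def _rewrite(self, body: bytes) -> bytes:
        try:
            doc = json.loads(body)
        except ValueError:
            return body  # not JSON, pass through untouched
        if not isinstance(doc, dict):
            return body
        if doc.get("type") == "Feature":
            doc = self.transform(doc)
        elif doc.get("type") == "FeatureCollection":
            doc["features"] = [self.transform(f) for f in doc.get("features", [])]
        else:
            return body
        return json.dumps(doc).encode()


async def demo_app(scope, receive, send):
    # Tiny ASGI app returning a single item, for illustration only
    body = json.dumps({"type": "Feature", "id": "a", "assets": {}}).encode()
    headers = [(b"content-type", b"application/geo+json"),
               (b"content-length", str(len(body)).encode())]
    await send({"type": "http.response.start", "status": 200, "headers": headers})
    await send({"type": "http.response.body", "body": body})


def tag(item):
    item["properties"] = {"patched": True}
    return item


sent = []


async def _collect(message):
    sent.append(message)


async def _receive():
    return {"type": "http.request", "body": b""}


app = ItemTransformMiddleware(demo_app, tag)
asyncio.run(app({"type": "http"}, _receive, _collect))
```

The trade-off is that every JSON response gets parsed and re-serialized, and the transform cannot tell which endpoint produced the item.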
My solution was to monkey-patch stac_fastapi.pgstac.core.Item. It's very little code, but there is no way to detect whether an item has already been patched, so user_hook might be called on an already patched item. That's not a problem in my case, though.
def install_item_hook(user_hook):
    """Patch pgstac to feed data through user_hook."""
    # pylint: disable=import-outside-toplevel
    import stac_fastapi.pgstac.core
    from stac_fastapi.types.stac import Item

    def _item_hook(*args, **kwargs):
        return user_hook(Item(*args, **kwargs))

    stac_fastapi.pgstac.core.Item = _item_hook
And here is the hook I needed:
def make_asset_links_absolute(item):
    """Patch assets[*].href to be absolute links."""
    # note this can be called on a patched item also
    self_link = None
    for link in item["links"]:
        if link["rel"] == "self":
            self_link = link["href"]
            break

    if self_link is None:
        return item

    # assumes self link points to json
    prefix = "/".join(self_link.split("/")[:-1])

    for asset in item["assets"].values():
        href = asset["href"]
        if ":" not in href:
            asset["href"] = f"{prefix}/{href}"
    return item


install_item_hook(make_asset_links_absolute)
As you mention, injecting custom transforms like this at the API level is difficult to do reliably without writing lots of custom code for each endpoint. I agree that the best approach for this use case is to subclass the appropriate backend and override methods accordingly.
Adjusting asset hrefs and links at the API level is, I think, more feasible, and is related to #191.
> I agree that the best approach for this use case is to subclass the appropriate backend and override methods accordingly.
@geospatial-jeff I probably should have gone with that approach. It looks like it's not a huge surface area to cover; it's just a bit tricky to find exact information on the backend interface from the docs alone. Now that I'm more familiar with the internals of this codebase, I would approach this differently.
This is how it is done in Microsoft Planetary Computer: https://github.com/microsoft/planetary-computer-apis/blob/b0471ea9f5e84268294b48bc22432dba93907331/pcstac/pcstac/client.py#L217