pystac-client

Federated search

Open matthewhanson opened this issue 4 years ago • 4 comments

A big advantage of STAC is being able to use data from multiple sources. It would be a nice feature to be able to search multiple STAC endpoints and combine the results into a single FeatureCollection.

matthewhanson avatar Apr 19 '21 20:04 matthewhanson

I have questions. First, would this be enough to support your use case, @matthewhanson?

from pystac_client import Client

client_a = Client.open("http://stac-api-a.test")
client_b = Client.open("http://stac-api-b.test")

search_a = client_a.search(collections=["foo"], datetime="2023-06-07")
search_b = client_b.search(collections=["bar"], datetime="2023-06-07")

items = search_a.item_collection()
# Proposed API: ItemCollection does not have an .extend() method yet.
items.extend(search_b.item_collection())

If that's enough, then we just need to add an .extend() method to ItemCollection in pystac.
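Until such a method exists, one way to get a single result set is to work on the item dicts directly. The following is a minimal sketch, assuming each search result is available as a list of GeoJSON-style item dicts. The helper name combine_items and the de-duplication on (collection, id) are illustrative assumptions, not part of pystac or pystac-client.

```python
from itertools import chain


def combine_items(*item_lists):
    """Combine item dicts from several searches into one FeatureCollection dict.

    Items are de-duplicated on (collection, id); the first occurrence wins.
    """
    seen = set()
    features = []
    for item in chain.from_iterable(item_lists):
        key = (item.get("collection"), item.get("id"))
        if key in seen:
            continue
        seen.add(key)
        features.append(item)
    return {"type": "FeatureCollection", "features": features}


# Stand-in data; real items would come from each client's search results.
items_a = [{"type": "Feature", "id": "a1", "collection": "foo"}]
items_b = [
    {"type": "Feature", "id": "b1", "collection": "bar"},
    {"type": "Feature", "id": "a1", "collection": "foo"},  # duplicate of items_a[0]
]
combined = combine_items(items_a, items_b)
```

Whether duplicates should be dropped at all is a design choice; two endpoints can legitimately serve different items under the same ID, in which case the key would also need to include the source.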

If that's not enough, I'm at a bit of a loss. Each STAC API tends to be so different that it doesn't seem realistic to, e.g., use the same collection IDs across clients. If you want to re-use the same set of parameters, it's pretty trivial to do this:

query = {
   "datetime": "2023-06-07",
   "bbox": [-73.21, 43.99, -73.12, 44.05],
}
items = client_a.search(collections=["foo"], **query).item_collection()
items.extend(client_b.search(collections=["bar"], **query).item_collection())
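The same pattern generalizes to any number of endpoints. Here is a rough sketch, assuming each endpoint is represented by a callable that accepts collections plus the shared query parameters and returns an iterable of item dicts. The names federated_search and make_fake_search are hypothetical, and the fake search only exists so the sketch runs offline.

```python
def federated_search(endpoints, **query):
    """Run the same query against several endpoints and concatenate the items.

    `endpoints` is a list of (search_fn, collections) pairs, where search_fn
    is any callable accepting `collections=` plus the shared query parameters
    and returning an iterable of item dicts.
    """
    items = []
    for search_fn, collections in endpoints:
        items.extend(search_fn(collections=collections, **query))
    return items


# Hypothetical stand-in for a real client search, so the sketch runs offline.
def make_fake_search(name):
    def search(collections, **query):
        return [
            {"id": f"{name}-{c}", "collection": c, "query": query}
            for c in collections
        ]
    return search


query = {"datetime": "2023-06-07", "bbox": [-73.21, 43.99, -73.12, 44.05]}
items = federated_search(
    [(make_fake_search("a"), ["foo"]), (make_fake_search("b"), ["bar"])],
    **query,
)
```

With pystac-client, a search_fn could plausibly be something like lambda **kw: client_a.search(**kw).items_as_dicts(). Collection IDs still have to be supplied per endpoint, since they generally differ between APIs.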

@matthewhanson, can you sketch out what you had in mind, if it's more than what I've described?

gadomski avatar Jun 07 '23 15:06 gadomski

The important thing here would be to ensure that, if a sort order was specified in the search, the results from the different endpoints are interleaved according to that order.

bitner avatar Nov 02 '23 16:11 bitner

Quick and dirty proof of concept for a federated search that merges records according to their sortby settings.

from pystac_client import Client
import morecantile
import heapq
from functools import reduce, cmp_to_key

def dot_get(path, d):
    """Look up a dotted path such as 'properties.datetime' in a nested dict."""
    return reduce(dict.get, path.split('.'), d)

def ogc_sort_func(sorts, a, b, depth=0):
    """cmp-style comparator: negative if a sorts before b under `sorts`."""
    if depth >= len(sorts):
        # Every sort field compared equal, so the two items tie.
        return 0
    sort = sorts[depth]
    field = sort.get('field')
    direction = sort.get('direction', 'asc')
    # The branches below are written for descending order; flip them for ascending.
    sign = 1 if direction.lower()[0] == 'd' else -1
    av = dot_get(field, a)
    bv = dot_get(field, b)
    if av == bv:
        # Tie on this field (including both missing): move on to the next sort field.
        return ogc_sort_func(sorts, a, b, depth=depth + 1)
    elif av is None:
        out = -1
    elif bv is None:
        out = 1
    elif av < bv:
        out = 1
    else:
        out = -1
    return sign * out

tms = morecantile.tms.get("WebMercatorQuad")
x, y, z = tms.tile(-93,45,5)
bbox = list(tms.bounds(morecantile.Tile(x, y, z)))
print(bbox)

sortby = [{"field":"properties.datetime","direction":"desc"},{"field":"id","direction":"desc"}]
datetime=["2020-10-10","2020-10-10T18:00:00Z"]
catalog = Client.open('https://planetarycomputer.microsoft.com/api/stac/v1')
results = catalog.search(
    limit=100,
    max_items=1000,
    bbox=bbox,
    collections=["naip"],
    datetime=datetime,
    sortby=sortby
)
a=results.items_as_dicts()

results = catalog.search(
    limit=100,
    max_items=1000,
    bbox=bbox,
    datetime=datetime,
    collections=["landsat-c2-l2"],
    sortby=sortby
)

b=results.items_as_dicts()

results = catalog.search(
    limit=100,
    max_items=1000,
    bbox=bbox,
    datetime=datetime,
    collections=["sentinel-2-l2a"],
    sortby=sortby
)

c=results.items_as_dicts()

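# cmp-style comparator bound to the shared sortby
compare = lambda l, r: ogc_sort_func(sortby, l, r)

print('merging')
# Lazily merge the three result streams, each already sorted by sortby.
g = heapq.merge(a, b, c, key=cmp_to_key(compare))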

print('cycling')
# Print up to the first 100 merged items; zip stops early if fewer are available.
for _, row in zip(range(100), g):
    print(dot_get('properties.datetime', row), row.get('id'), row.get('collection'))
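The merge step relies on a property of heapq.merge: it only yields a correct global ordering when every input is already sorted by the same key, which is why each search above passes the same sortby. A minimal offline sketch of that behavior, using made-up items and a single descending sort field rather than the comparator above:

```python
import heapq

# Made-up results from two endpoints, each already sorted by datetime descending.
source_a = [
    {"id": "a2", "properties": {"datetime": "2020-10-10T17:00:00Z"}},
    {"id": "a1", "properties": {"datetime": "2020-10-10T09:00:00Z"}},
]
source_b = [
    {"id": "b2", "properties": {"datetime": "2020-10-10T12:00:00Z"}},
    {"id": "b1", "properties": {"datetime": "2020-10-10T03:00:00Z"}},
]

# reverse=True tells heapq.merge that the inputs are sorted largest-first.
merged = list(
    heapq.merge(
        source_a,
        source_b,
        key=lambda item: item["properties"]["datetime"],
        reverse=True,
    )
)
merged_ids = [item["id"] for item in merged]
```

Comparing the datetime strings directly is an assumption that only holds when all endpoints return timestamps in the same format and time zone; otherwise they would need to be parsed first.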

bitner avatar Nov 02 '23 22:11 bitner

For that, I sorted the items as plain dicts, but if we were to actually implement this, we could work with Item objects and either create a new subclass or monkeypatch an __lt__ method onto the class.
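A rough sketch of the class-based idea, assuming a single sort field for brevity. SortableItem is a hypothetical wrapper around an item dict, not a pystac class; a real implementation would wrap or subclass pystac's Item and honor the full sortby list.

```python
import heapq


class SortableItem:
    """Hypothetical wrapper that gives an item dict an ordering for heapq.merge."""

    def __init__(self, item, field="datetime", descending=True):
        self.item = item
        self.value = item["properties"][field]
        self.descending = descending

    def __eq__(self, other):
        return self.value == other.value

    def __lt__(self, other):
        # "Less than" here means "comes earlier in the output order".
        if self.descending:
            return self.value > other.value
        return self.value < other.value


def wrap(ident, timestamp):
    return SortableItem({"id": ident, "properties": {"datetime": timestamp}})


# Made-up, pre-sorted (newest first) results from two endpoints.
results_a = [wrap("a3", "2021-05-02T10:00:00Z"), wrap("a1", "2021-05-01T08:00:00Z")]
results_b = [wrap("b2", "2021-05-01T20:00:00Z")]

# No key function is needed because the wrapper defines the ordering itself.
merged_ids = [s.item["id"] for s in heapq.merge(results_a, results_b)]
```

Putting the ordering on the object keeps the merge call simple, at the cost of baking one particular sort order into each wrapped item.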

bitner avatar Nov 02 '23 22:11 bitner