pystac-client
Search is creating multiple HTTP requests
When I perform this search through pystac-client, it does not send a single GET /search request to the API. It first requests the landing page, then fetches all the collections, and only then performs the search. I checked this by inspecting the traffic going through a proxy, and it seemed quite odd. This causes problems in particular when I run simultaneous searches from parallel processes, since the web server treats the burst of requests as a sort of attack.
from pystac_client import Client

catalog = Client.open("...")
my_search = catalog.search(
    max_items=100,
    collections=['EO.Copernicus.S2.L2A'],
    bbox=[11.2, 46.4, 11.4, 46.5],
    query={"eo:cloud_cover": {"lt": 70}},
    datetime=['2023-01-01T00:00:00Z', '2023-01-02T00:00:00Z'],
    method='GET',
)
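For reference, the single GET /search request this search is expected to boil down to can be sketched with only the standard library. This is a rough illustration, not what pystac-client does internally: the base URL is a placeholder, build_search_url is a hypothetical helper, and sending the query as a JSON string is an assumption based on how the STAC query extension is usually used with GET. Note also that limit here is the page size requested from the server, which is not quite the same thing as the client-side max_items cap.

```python
import json
from urllib.parse import urlencode


def build_search_url(base_url, *, collections, bbox, datetime, query, limit):
    """Build a GET /search URL from the search parameters (hypothetical helper)."""
    params = {
        # Lists are sent as comma-separated values in a GET request.
        "collections": ",".join(collections),
        "bbox": ",".join(str(v) for v in bbox),
        # A closed datetime interval is written as "start/end".
        "datetime": "/".join(datetime),
        # Assumption: the query extension takes a JSON-encoded string on GET.
        "query": json.dumps(query, separators=(",", ":")),
        "limit": limit,
    }
    return f"{base_url.rstrip('/')}/search?{urlencode(params)}"


url = build_search_url(
    "https://stac.example.com",  # placeholder, not the real API root
    collections=["EO.Copernicus.S2.L2A"],
    bbox=[11.2, 46.4, 11.4, 46.5],
    datetime=["2023-01-01T00:00:00Z", "2023-01-02T00:00:00Z"],
    query={"eo:cloud_cover": {"lt": 70}},
    limit=100,
)
print(url)
```

Comparing this URL with what shows up in the proxy log should make it easy to tell the search request apart from the landing page and collection requests.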
I think that initial request is needed to check the conformance classes the API publishes. You might be able to disable it with ignore_conformance=True. The docs have a bit of info on conformance: https://pystac-client.readthedocs.io/en/stable/usage.html
On Fri, Jan 12, 2024 at 9:43 AM, chiarch84 wrote:
this causes problems when I try to do simultaneous searches from parallel processes
That is a fair point. In the future it might make sense for pystac-client to make it easier to skip that initial GET.
In the meantime I took a look at the code, and it doesn't look like ignore_conformance=True will skip the initial GET. That argument just makes it so the conformance classes are not considered when the user requests certain actions.
If you want to avoid any superfluous network calls, I would recommend skipping the Client object entirely and just using ItemSearch directly. Here is what that would look like:
from pystac_client import ItemSearch
search = ItemSearch(url="https://earth-search.aws.element84.com/v1/search", collections=["cop-dem-glo-30"], max_items=1)
Notice that the URL ends in /search.
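For the parallel case, a minimal sketch of the same idea could look like the following. It assumes each worker builds its own ItemSearch against the /search endpoint and that a small worker pool keeps the number of simultaneous requests low. The helper names (search_kwargs, run_search, run_all) and the worker count are made up for illustration, and threads are used here instead of processes only to keep the example short.

```python
from concurrent.futures import ThreadPoolExecutor

# Endpoint taken from the example above; note that it ends in /search.
SEARCH_URL = "https://earth-search.aws.element84.com/v1/search"


def search_kwargs(bbox):
    """Collect the ItemSearch arguments for one bounding box (hypothetical helper)."""
    return {
        "url": SEARCH_URL,
        "collections": ["cop-dem-glo-30"],
        "bbox": bbox,
        "max_items": 1,
    }


def run_search(bbox):
    """Run one search without creating a Client, returning the matching item IDs."""
    # Imported lazily so the helpers above can be used without pystac-client installed.
    from pystac_client import ItemSearch

    search = ItemSearch(**search_kwargs(bbox))
    return [item.id for item in search.items()]


def run_all(bboxes, max_workers=2):
    """Run one search per bounding box, capped at max_workers concurrent requests."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_search, bboxes))
```

Calling run_all with a list of bounding boxes would then issue one search per box, with no landing page or collection lookups in between. Whether two workers is the right cap depends on what the server tolerates.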
Thanks for the suggestion! I will try!