pystac-client icon indicating copy to clipboard operation
pystac-client copied to clipboard

Fix double-encoding of intersects in GET requests

Open gadomski opened this issue 3 years ago • 4 comments

Related Issue(s):

  • Closes #335

Description:

The stringified intersects dictionary was being double-stringified. This fix uses urllib.parse.quote_plus to urlencode the intersects string for GET requests.

Includes:

  • Request parameter assertions using pytest-httpserver
  • Denser intersects stringification to increase chance of GET request being short enough

This fix proved hard to test using the existing procedures, because stac-fastapi does not support intersects for GET requests: https://github.com/stac-utils/stac-fastapi/issues/468.

PR Checklist:

  • [x] Code is formatted
  • [x] Tests pass
  • [x] Changes are added to the CHANGELOG

gadomski avatar Oct 21 '22 20:10 gadomski

Codecov Report

Base: 85.50% // Head: 85.87% // Increases project coverage by +0.37% :tada:

Coverage data is based on head (5a67ddf) compared to base (23b570d). Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #337      +/-   ##
==========================================
+ Coverage   85.50%   85.87%   +0.37%     
==========================================
  Files          11       11              
  Lines         800      800              
==========================================
+ Hits          684      687       +3     
+ Misses        116      113       -3     
Impacted Files Coverage Δ
pystac_client/item_search.py 92.97% <100.00%> (ø)
pystac_client/stac_api_io.py 89.18% <100.00%> (+2.70%) :arrow_up:

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

codecov-commenter avatar Oct 21 '22 20:10 codecov-commenter

I'm converting this to draft because I'd like to test it on a real-world server implementation first. For https://github.com/stac-utils/pystac-client/pull/362, I couldn't url encode the parameter when testing against the Planetary Computer. I'd like to know whether that's a server bug, which will determine whether we url encode the intersects parameter in this PR.

gadomski avatar Nov 28 '22 18:11 gadomski

To keep this moving, and in liu of a production system to run a real-world test against, I tested this against https://github.com/stac-utils/stac-fastapi/commit/97b091127e41b24a600cdbc49466074562f554ae with the following patch:

diff --git a/stac_fastapi/pgstac/stac_fastapi/pgstac/core.py b/stac_fastapi/pgstac/stac_fastapi/pgstac/core.py
index 9b194f7..68d7620 100644
--- a/stac_fastapi/pgstac/stac_fastapi/pgstac/core.py
+++ b/stac_fastapi/pgstac/stac_fastapi/pgstac/core.py
@@ -358,6 +358,9 @@ class CoreCrudClient(AsyncBaseCoreClient):
         """
         request = kwargs["request"]
         query_params = str(request.query_params)
+        import json
+        import urllib.parse
+        print(json.loads(urllib.parse.unquote_plus(kwargs["intersects"][0])))
 
         # Kludgy fix because using factory does not allow alias for filter-lang
         if filter_lang is None:

I then ran:

cd stac-fastapi && docker-compose up &
cd ../pystac-client && stac-client search --method GET http://localhost:8082 --intersects '{"type":"Point","coordinates":[-105.1019,40.1672]}'

Relevant output:

stac-fastapi-pgstac                   | {'type': 'Point', 'coordinates': [-105.1019, 40.1672]}
stac-fastapi-pgstac                   | INFO:     172.19.0.1:56374 - "GET /search?limit=100&intersects=%257B%2522type%2522%253A%2522Point%2522%252C%2522coordinates%2522%253A%255B-105.1019%252C40.1672%255D%257D HTTP/1.1" 200 OK

I'm going to call that good for now, and mark this PR as ready-to-review.

gadomski avatar Dec 13 '22 16:12 gadomski

I tried it against three instances:

# url = "https://earth-search.aws.element84.com/v1"
# collection="sentinel-2-l2a"

# url = "https://planetarycomputer.microsoft.com/api/stac/v1"
# collection="sentinel-2-l2a"

url = "https://tamn.snapplanet.io"
collection="S2"

with 0.5.1, I got:

  • Earth Search: 500 internal error (not clear what the problem is)
  • PC: never stops (so likely the next link is just on repeat)
  • Snapplanet: works

with this change, I got:

  • Earth Search: 400 {"code":"BadRequest","description":"Invalid GeoJSON geometry"}
  • PC: never stops
  • Snapplanet: error

philvarner avatar Dec 14 '22 14:12 philvarner

with this change, I got: Earth Search: 400 {"code":"BadRequest","description":"Invalid GeoJSON geometry"} Snapplanet: error

This makes me think that those servers aren't unquoting the param before parsing it as JSON. Per https://github.com/stac-utils/pystac-client/pull/362#issuecomment-1329717159 this is technically a bug, but since

  1. we don't have control over those servers
  2. they're commonly used
  3. there's probably other servers that don't unquote the param

this makes me think we shouldn't url-quote the param here or in #362. @philvarner thoughts?

gadomski avatar Dec 15 '22 19:12 gadomski

Closing as OBE (https://github.com/stac-utils/pystac-client/pull/362#discussion_r1156249806).

gadomski avatar Apr 03 '23 17:04 gadomski