pgstac icon indicating copy to clipboard operation
pgstac copied to clipboard

Enable Strict Mode for Queryables / Search

Open bitner opened this issue 1 year ago • 9 comments

Enable mode to only allow search properties that are in the queryables table. When a queryable has a range or enum, return an error when a search requests values outside of that range.

bitner avatar Feb 07 '23 18:02 bitner

Having this feature would be greatly appreciated: https://github.com/stac-utils/stac-fastapi-pgstac/issues/61

drnextgis avatar Sep 23 '23 17:09 drnextgis

@bitner I would like to dive into implementing this feature. I've conducted some initial testing, which you can review in this commit. However, I have a few uncertainties. Firstly, is this the appropriate place to put this functionality?

I did some tests with pure SQL and with stac-fastapi. At first glance it seems to be working.

SQL

postgis=# SELECT cql2_query('{"op": "gte", "args": [{"property": "eo:cloud_cover"}, 0.59]}');
                            cql2_query                             
-------------------------------------------------------------------
 to_int(content->'properties'->'eo:cloud_cover') >= to_int('0.59')
(1 row)

postgis=# SELECT cql2_query('{"op": "gte", "args": [{"property": "xxx"}, 0.59]}');
ERROR:  Term xxx is not found in queryables.
CONTEXT:  PL/pgSQL function cql2_query(jsonb,text) line 110 at RAISE

postgis=# SELECT cql2_query('{"op": "gte", "args": [{"property": "gsd"}, 0.59]}');
ERROR:  Term gsd is not found in queryables.
CONTEXT:  PL/pgSQL function cql2_query(jsonb,text) line 110 at RAISE

STAC API

Using http requests it also seems to be working.

$ echo '{"filter-lang": "cql2-json", "filter": {"op": "gte", "args": [{"property": "eo:cloud_cover"}, 0.59]}}' | http http://0.0.0.0:8082/search | jq .context
{
  "limit": 10,
  "returned": 0
}
$ echo '{"filter-lang": "cql2-json", "filter": {"op": "gte", "args": [{"property": "xxx"}, 0.59]}}' | http http://0.0.0.0:8082/search | jq
{
  "code": "RaiseError",
  "description": "Term xxx is not found in queryables."
}

But this example is kind of weird because gsd doesn't exist in queryables (but unlike xxx it exists in STAC items):

$ echo '{"filter-lang": "cql2-json", "filter": {"op": "gte", "args": [{"property": "gsd"}, 0.59]}}' | http http://0.0.0.0:8082/search | jq .context
{
  "limit": 10,
  "returned": 10
}

If I modify the value, it starts functioning as anticipated :thinking:

$ echo '{"filter-lang": "cql2-json", "filter": {"op": "gte", "args": [{"property": "gsd"}, 0.0]}}' | http http://0.0.0.0:8082/search
{
  "code": "RaiseError",
  "description": "Term gsd is not found in queryables."
}

drnextgis avatar Sep 24 '23 20:09 drnextgis

Alright, the mystery has been unraveled. It turns out that the issue stemmed from the fact that the search query was stored in the searches table cache. Once it was removed from there, everything functioned as anticipated:

$ echo '{"filter-lang": "cql2-json", "filter": {"op": "gte", "args": [{"property": "gsd"}, 0.59]}}' | http http://0.0.0.0:8082/search
{
  "code": "RaiseError",
  "description": "Term gsd is not found in queryables."
}

So, please confirm if I'm headed in the right direction, and if so, I'll proceed with crafting a PR.

drnextgis avatar Sep 25 '23 18:09 drnextgis

Hey @drnextgis I'll look through this and get back to you later today

bitner avatar Sep 26 '23 14:09 bitner

@drnextgis I think that your approach looks like a good place to put this. Let me know if there is anything that I can do to help you get a PR up.

With 0.8.0 I tried to simplify things a bit to allow PRs with "unreleased" rather than version tagged migrations, so you will not need to create a new version tag in order to get the tests to run any more, but you do need to run scripts/stageversion which will create the database migrations for you.

Do make sure to add some tests either in the basic sql tests (this is just a sql file that gets run and another file that is just the expected output from running that sql file) or using the pgtap tests.

Let me know if there is anything that you'd like to get on a call to pair with, I'd be expected to help get more contributors!

bitner avatar Sep 28 '23 14:09 bitner

I'd look forward to this feature as well. Just want to make sure we're considering it an optional mode (defaulted to off) configured in pgstac.settings so that searches work before users can get their queryables configured.

mmcfarland avatar Sep 28 '23 15:09 mmcfarland

@mmcfarland @drnextgis yes, we should set this as a configuration parameter that should also control whether queryables advertises "additional_fields":true or not.

bitner avatar Sep 28 '23 15:09 bitner

Thank you for the guidance @bitner! I've prepared a PR, please review it: https://github.com/stac-utils/pgstac/pull/216

drnextgis avatar Sep 28 '23 23:09 drnextgis

@bitner, after reviewing the documentation, I've come to understand that that the queryables schema serves an informative role and isn't intended for validation purposes. Additionally, the validation process is contingent on the specific operator being used. For instance, in the case of the != operator, none of the checks you mentioned (range or enum) can be executed.

drnextgis avatar Oct 04 '23 22:10 drnextgis