tiled
Improve documentation of search query HTTP API
Following up on conversation in #367, we want documented examples of search query URLs
FYI: some new examples in this notebook.
@prjemian : I'm having some difficulty with the concepts of the "filter" and "sort" parameters for the search query.
Does the sort=time key-value pair automagically find every (sub-)field named "time" in the metadata? In this case does it find and compare the "time" field within the run's "start" document, the "time" field within the run's "stop" document, both, or whichever it finds first for the node?
Does the filter get applied to the value of the key specified by sort=KEY? Or is there something special about the filter ['time_range'] that knows to look for a time value in some field?
If the search is finding a (non-unique) field named "time", can I be more specific in the query? Maybe something like sort=start.time?
...and can sort direction (ASC vs DESC) be specified for pagination in the HTTP API?
I realize I'm being a pain here. I'm also happy to help with testing and updating documentation if you can point me in the right direction. :)
No worries at all. I've been in all-day meetings yesterday and today, so I'm not quite keeping up, but I'm happy for the feedback on what people are interested in having documented.
Yes, you can sort on nested keys by using dots, as in sample.element. Currently, you can only search and sort on keys in the start document, and time means "time in the start document". There is a path to expanding that in the future.
The direction is ascending by default; prepend a minus sign (-) to get descending.
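For anyone landing here later, a minimal sketch of how the sort values described above could be assembled into a query string. The server address and the search route in BASE_URL are assumptions for illustration (check your own deployment), and sort_value and search_url are hypothetical helper names, not part of tiled.

```python
from urllib.parse import urlencode

# ASSUMPTION: hypothetical server address and search route; adjust to your deployment.
BASE_URL = "http://localhost:8000/api/v1/search/"


def sort_value(key: str, descending: bool = False) -> str:
    """Return the value for the sort parameter; a leading '-' means descending."""
    return f"-{key}" if descending else key


def search_url(key: str, descending: bool = False, base_url: str = BASE_URL) -> str:
    """Build a search URL that sorts on key (dots reach nested keys)."""
    query = urlencode({"sort": sort_value(key, descending)})
    return f"{base_url}?{query}"


print(search_url("time"))                   # ascending by time in the start document
print(search_url("time", descending=True))  # descending
print(search_url("sample.element"))         # nested key, ascending
```

Neither the minus sign nor the dot needs percent-encoding, so the sort value appears in the URL exactly as typed.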
Much appreciated!
Yes, you can sort on nested keys by using dots, as in sample.element. Currently, you can only search and sort on keys in the start document and time means "time in the start document". There is a path to expanding that in the future.
Does this mean that search functionality is currently limited to bluesky runs? Or does it search the first / outer level of metadata available from file directories served by tiled?
I'm working by trial-and-error at the moment with a tiled instance that is serving data files from a beamline directory.
@pshafer-als I, too, am trying to learn more about accessing info as a tiled client. For one of our projects (to join various metadata databases at ANL), the interest is how to get the metadata for a bluesky measurement. To help me learn how to access it, I used code from tiled.client to make queries of the tiled server. Then, I inspected the server's logs for the specific URI parts from each query. Sometimes, a client query used more than one URI to get its information from the server.
As @danielballan said above, the sort=time part of the URI tells the tiled server, once it has a list of Node objects, to sort that list by the floating-point time key (found in the start document's metadata) in ascending order. This key is one of the two (?) keys expected in every start document, so we assume it will always be there. @danielballan also said that sort=-time will sort in descending order.
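To make the sorting behavior concrete, here is a client-side sketch of the same semantics applied to plain dictionaries. It is only an illustration of what sort=time, sort=-time, and dotted keys mean, not how the tiled server implements sorting; the start documents below are made up, and get_nested and sort_runs are hypothetical names.

```python
from functools import reduce


def get_nested(mapping, dotted_key):
    """Look up 'a.b.c' as mapping['a']['b']['c']."""
    return reduce(lambda d, k: d[k], dotted_key.split("."), mapping)


def sort_runs(start_docs, sort="time"):
    """Sort start documents by a (possibly dotted) key; a leading '-' reverses the order."""
    descending = sort.startswith("-")
    key = sort[1:] if descending else sort
    return sorted(start_docs, key=lambda doc: get_nested(doc, key), reverse=descending)


# Made-up start documents, for illustration only.
runs = [
    {"uid": "b", "time": 1650000200.0, "sample": {"element": "Fe"}},
    {"uid": "a", "time": 1650000100.0, "sample": {"element": "Ni"}},
    {"uid": "c", "time": 1650000300.0, "sample": {"element": "Cu"}},
]

print([r["uid"] for r in sort_runs(runs, "time")])            # ['a', 'b', 'c']
print([r["uid"] for r in sort_runs(runs, "-time")])           # ['c', 'b', 'a']
print([r["uid"] for r in sort_runs(runs, "sample.element")])  # ['c', 'b', 'a']
```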
With our metadata project, one requirement was to avoid loading additional libraries, such as tiled.client. In our local project, we agreed that the requests package is preferred over urllib.request from the Python Standard Library.
We learn the URIs used by tiled.client, then remake those queries by constructing URIs which are queried via r = requests.get(uri).json(), then use r, a dictionary with the query results from the tiled server.
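A slightly more defensive sketch of that r = requests.get(uri).json() step, in case it helps. The fetch_json name and its optional get argument are hypothetical, added here so the helper can be exercised without a running server; the URI in the comment is also an assumption, not a documented route.

```python
def fetch_json(uri, get=None):
    """GET uri and return the decoded JSON body.

    get defaults to requests.get; any callable returning an object with
    raise_for_status() and json() methods can be substituted.
    """
    if get is None:
        import requests  # third-party package, imported only when actually needed

        get = requests.get
    response = get(uri)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    return response.json()


# Example (needs a reachable tiled server; the URI below is an assumption):
# r = fetch_json("http://localhost:8000/api/v1/search/?sort=-time")
```

Checking the status before calling json() means a bad URI shows up as an HTTP error rather than as a confusing JSON decoding failure.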
So, it's the filter items in the URI that actually do the "searching". Here, filter seems more appropriate since we are taking a large list of runs (the entire catalog) and selecting those runs which satisfy the filter we present. Gradually, I will comprehend the syntax enough to compose URI queries independently.
I used code from tiled.client to make queries of the tiled server. Then, I inspected the server's logs for the specific URI parts from each query. Sometimes, a client query used more than one URI to get its information from the server.
That's a good suggestion!
The filter[time_range] part of the URI is a specialized query for a common search. I'm sure it is possible to make [ge] and [lt] queries on the actual start.time field, but this is less prone to error and easier to read.
I'm adding notes here for future me, or someone in a similar situation. Tiled has a mechanism for registering custom queries.
See:
This has been tremendously helpful. Thanks @prjemian !