
Improve documentation of search query HTTP API

Open danielballan opened this issue 2 years ago • 5 comments

Following up on the conversation in #367, we want documented examples of search query URLs.

danielballan avatar Nov 30 '22 20:11 danielballan

FYI: some new examples in this notebook.

prjemian avatar Dec 13 '22 18:12 prjemian

@prjemian : I'm having some difficulty with the concepts of the "filter" and "sort" parameters for the search query.

Does the sort=time key-value pair automagically find every (sub-)field named "time" in the metadata? In this case does it find and compare the "time" field within the run's "start" document, the "time" field within the run's "stop" document, both, or whichever it finds first for the node?

Does the filter get applied to the value of the key specified by sort=KEY? Or is there something special about filter[time_range] that makes it look for a time value in a particular field?

padraic-shafer avatar Dec 14 '22 01:12 padraic-shafer

If the search is finding a (non-unique) field named "time", can I be more specific in the query? Maybe something like sort=start.time?

padraic-shafer avatar Dec 14 '22 01:12 padraic-shafer

...and can sort direction (ASC vs DESC) be specified for pagination in the HTTP API?

padraic-shafer avatar Dec 14 '22 01:12 padraic-shafer

I realize I'm being a pain here. I'm also happy to help with testing and updating documentation if you can point me in the right direction. :)

padraic-shafer avatar Dec 14 '22 01:12 padraic-shafer

No worries at all. I've been in all-day meetings yesterday and today, so I'm not quite keeping up, but I'm happy to get feedback on what people are interested in having documented.

Yes, you can sort on nested keys by using dots, as in sample.element. Currently, you can only search and sort on keys in the start document and time means "time in the start document". There is a path to expanding that in the future.

The direction is ascending by default; prepend a minus sign - to get descending.
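
As a minimal sketch of the two rules above (dots for nested keys, a leading minus for descending order), the query string could be assembled like this. The base URL and the helper name `build_sort_url` are placeholders for illustration, not taken from this thread; check your own server for the actual search route.

```python
from urllib.parse import urlencode

# Hypothetical base URL; substitute the search route of your own tiled server.
BASE_URL = "http://localhost:8000/api/v1/search/"


def build_sort_url(key, descending=False, base_url=BASE_URL):
    """Return a search URL that sorts on `key` (dots address nested keys)."""
    # A leading minus sign requests descending order; ascending is the default.
    sort_value = f"-{key}" if descending else key
    return f"{base_url}?{urlencode({'sort': sort_value})}"


print(build_sort_url("time"))                   # ascending by start-document time
print(build_sort_url("time", descending=True))  # descending
print(build_sort_url("sample.element"))         # nested key in the start document
```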

danielballan avatar Dec 14 '22 19:12 danielballan

Much appreciated!

> Yes, you can sort on nested keys by using dots, as in sample.element. Currently, you can only search and sort on keys in the start document and time means "time in the start document". There is a path to expanding that in the future.

Does this mean that search functionality is currently limited to bluesky runs? Or does it search the first / outer level of metadata available from file directories served by tiled?

I'm working by trial-and-error at the moment with a tiled instance that is serving data files from a beamline directory.

padraic-shafer avatar Dec 14 '22 21:12 padraic-shafer

@pshafer-als I, too, am trying to learn more about accessing info as a tiled client. For one of our projects (to join various metadata databases at ANL), the interest is how to get the metadata for a bluesky measurement. To help me learn how to access, I used code from tiled.client to make queries of the tiled server. Then, I inspected the server's logs for the specific URI parts from each query. Sometimes, a client query used more than one URI to get its information from the server.

As @danielballan said above, the sort=time part of the URI tells the tiled server, once it has a list of Node objects, to sort that list in ascending order by the floating-point time key (found in the start document's metadata). This key is one of the two (?) keys expected in every start document, so we assume it will always be there. @danielballan also said that sort=-time will sort in descending order.

prjemian avatar Dec 14 '22 21:12 prjemian

With our metadata project, one requirement was to avoid loading additional libraries, such as tiled.client. In our local project, we agreed that the requests package is preferred over urllib.request from the Python Standard Library.

We learn the URIs used by tiled.client, then reproduce those queries by constructing the same URIs and fetching them with r = requests.get(uri).json(); r is then a dictionary holding the query results from the tiled server.
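
The replay step described above might look like the following sketch. The helper name `fetch_query` is hypothetical. With a real server, pass `requests.Session()` (or the `requests` module itself, which also provides `get`) together with a URI copied from the server's logs.

```python
def fetch_query(uri, session):
    """GET `uri` with a requests-style session and return the decoded JSON body."""
    response = session.get(uri)
    response.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error body
    return response.json()


# Real use (needs the `requests` package and a reachable tiled server):
#     import requests
#     r = fetch_query(uri_copied_from_server_logs, requests.Session())
```

Taking the session as an argument keeps the helper independent of any particular server and makes it easy to dry-run with a stand-in object.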

So, it's the filter items in the URI that actually do the "searching". Here, filter seems more appropriate since we are taking a large list of runs (the entire catalog) and selecting those runs which satisfy the filter we present. Gradually, I will comprehend the syntax enough to compose URI queries independently.

prjemian avatar Dec 14 '22 22:12 prjemian

> I used code from tiled.client to make queries of the tiled server. Then, I inspected the server's logs for the specific URI parts from each query. Sometimes, a client query used more than one URI to get its information from the server.

That's a good suggestion!

padraic-shafer avatar Dec 14 '22 22:12 padraic-shafer

The filter[time_range] part of the URI is a specialized query for a common search. I'm sure it is possible to make [ge] and [lt] queries on the actual start.time field but this is less prone to error and easier to read.
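
To make that comparison concrete, here is a purely client-side sketch of what such a time-range filter would select, assuming it behaves like a [ge] bound paired with a [lt] bound on the start document's time. The function name and the sample run dictionaries are made up for illustration; this is not the server's implementation.

```python
def in_time_range(runs, since, until):
    """Keep runs whose start time t satisfies since <= t < until (epoch seconds)."""
    return [run for run in runs if since <= run["start"]["time"] < until]


# Made-up metadata shaped like bluesky runs: each has a start document with a float time.
runs = [
    {"start": {"uid": "a", "time": 1670000000.0}},
    {"start": {"uid": "b", "time": 1670500000.0}},
    {"start": {"uid": "c", "time": 1671000000.0}},
]

selected = in_time_range(runs, since=1670500000.0, until=1671000000.0)
# Only "b" is kept: the lower bound is inclusive and the upper bound is exclusive.
print([run["start"]["uid"] for run in selected])
```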

prjemian avatar Dec 14 '22 22:12 prjemian

I'm adding notes here for future me, or someone in a similar situation. Tiled has a mechanism for registering custom queries.

See:

padraic-shafer avatar Dec 16 '22 05:12 padraic-shafer

> > I used code from tiled.client to make queries of the tiled server. Then, I inspected the server's logs for the specific URI parts from each query. Sometimes, a client query used more than one URI to get its information from the server.
>
> That's a good suggestion!

This has been tremendously helpful. Thanks @prjemian !

padraic-shafer avatar Dec 16 '22 05:12 padraic-shafer