Revived proposal for representing BlueskyRuns in Tiled
Brief summary of discussions with @padraic-shafer, @genematx, and @tacaswell, drawing on earlier discussions with @dylanmcreynolds and @whs92.
Goals:
- Provide a stable URL to a given column of data, regardless of whether it is in-line in the Event documents or external.
- Data in a given stream should be presented to the user in a flat namespace, without exposing details about internal versus external storage in the URL path.
- Server should reveal to clients how the data is grouped into tables and arrays underneath, facilitating efficient chunked access (i.e. not separate requests for each column).
- It should be possible to add "views", such as simplified/flatter views, of the data in a BlueskyRun. (NeXus does something similar.) A canonical example is re-mapping a mapping scan (1D in the canonical Bluesky representation) into a more scientifically relevant N-dimensional view (reshaping).
- Lay the groundwork for eventually providing access to auxiliary information (e.g. proxied from Archiver Appliance).
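To make the reshaping goal concrete, here is a minimal sketch using plain NumPy. It is not Tiled code: the 3 x 4 grid, the variable names, and the synthetic `intensity` values are all assumptions made up for illustration. It shows a raster scan stored as 12 sequential events and the same data presented on the scan grid.

```python
import numpy as np

# Hypothetical 3 x 4 raster scan, recorded as 12 sequential events (row-major).
n_rows, n_cols = 3, 4
x = np.tile(np.arange(n_cols), n_rows)    # fast axis position per event
y = np.repeat(np.arange(n_rows), n_cols)  # slow axis position per event
intensity = 10 * y + x                    # stand-in for a detector reading

# Canonical representation: every field is a flat, 1D column of length 12.
assert intensity.shape == (n_rows * n_cols,)

# A reshaped "view": the same values laid out on the (slow, fast) scan grid.
grid = intensity.reshape(n_rows, n_cols)
```

This only works as-is for a regular row-major scan. A snake (serpentine) scan would also need every other row reversed, which is one reason a view would likely need scan-specific knowledge rather than a blanket reshape.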
Revive https://github.com/bluesky/tiled/pull/668 which proposed two ideas.
- A new structure family named ~union~ "consolidated", which lets us cleanly represent a Bluesky stream backed by a combination of tables and arrays
- A concept of "views", alternate (simplified...) layouts of the canonical data
Both suggestions were developed and viewed positively at the time, but set aside back in March to make progress on other things. Now, we propose to implement both.
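As a rough illustration of the "consolidated" idea (a toy in-memory model, not the implementation from the PR), the sketch below exposes a flat namespace of keys while still reporting which underlying part holds each key. The class name `ConsolidatedSketch`, its methods, and the part names `"internal"` and `"image"` are hypothetical.

```python
import numpy as np


class ConsolidatedSketch:
    """Toy stand-in for a "consolidated" node: flat keys over grouped parts."""

    def __init__(self, parts):
        # parts maps a part name to the columns stored together in that part:
        # {part_name: {key: np.ndarray}}
        self._parts = parts
        self._index = {}
        for part_name, columns in parts.items():
            for key in columns:
                if key in self._index:
                    # A flat namespace cannot hold the same key twice.
                    raise ValueError(f"duplicate key {key!r}")
                self._index[key] = part_name

    def keys(self):
        """Flat listing of keys, hiding how they are stored."""
        return sorted(self._index)

    def part_of(self, key):
        """Reveal the grouping so a client can fetch a whole part at once."""
        return self._index[key]

    def __getitem__(self, key):
        return self._parts[self._index[key]][key]

    def read_part(self, part_name):
        """Return all columns stored together in one part."""
        return self._parts[part_name]


stream = ConsolidatedSketch(
    {
        "internal": {"time": np.arange(3.0), "motor": np.array([0.0, 0.5, 1.0])},
        "image": {"image": np.zeros((3, 2, 2))},
    }
)
```

A client could call `stream.keys()` to see one flat list, then use `stream.part_of("motor")` to learn that `motor` and `time` live in the same table and fetch both with a single `read_part("internal")`.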
In URLs:
# Bluesky stuff...
.../<uuid>/streams/<stream name> # "consolidated" structure family
.../<uuid>/streams/<stream name>?part=<part> # table of columns stored together, or array
.../<uuid>/streams/<stream name>/<key> # array
.../<uuid>/config/<stream name>/<obj name> # table of configuration readings
# New concept: simplified/rearranged "views":
.../<uuid>/views/<view name>/... # server-side registerable views
# New concept: auxiliary info (e.g. from EPICS Archiver Appliance)
.../<uuid>/auxiliary/<aux name>/... # information from outside Tiled
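The stream URLs above could be assembled with a small helper like the one below. This is only a sketch of the proposed layout: the function name `stream_url` is hypothetical, and `base` stands in for the elided `...` prefix, which the proposal does not specify.

```python
from urllib.parse import quote, urlencode


def stream_url(base, uid, stream, key=None, part=None):
    """Build a URL following the proposed .../<uuid>/streams/<stream name> layout.

    `base` is a placeholder for whatever prefix the server actually uses.
    """
    if key is not None and part is not None:
        raise ValueError("pass either key or part, not both")
    # Percent-encode each path segment so names with "/" or spaces stay intact.
    path = "/".join(quote(segment, safe="") for segment in (uid, "streams", stream))
    url = f"{base.rstrip('/')}/{path}"
    if key is not None:
        # A single column, addressed as an array.
        return f"{url}/{quote(key, safe='')}"
    if part is not None:
        # A group of columns stored together (table), or an array.
        return f"{url}?{urlencode({'part': part})}"
    # The "consolidated" node for the whole stream.
    return url
```

For example, `stream_url("https://example.invalid/prefix", "abc", "primary", key="image")` returns `https://example.invalid/prefix/abc/streams/primary/image`, which stays the same whether `image` is stored in-line or externally.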
In Python API:
run.streams["primary"].read() # xarray.Dataset
run.streams["primary"]["image"] # array
run.config["primary"]["fccd"].read() # table of configuration readings
run.views["simple"].read()
# and equivalently...
run.read(view="simple") # might even be the default for run.read(), which currently raises...
run.aux["archiver"]["PV..."].read() # array
Nice summary. I haven't really thought about this auxiliary concept. It's fascinating to me but feels lower priority than the other endpoints?
Yes, we fully agree. This is a new concept that came up this week, and it is included only to buttress the idea that having a namespace in that position (streams, config, views, aux) opens up multiple potentially interesting paths.
I think the ability to also easily access archive data is amazing.
I think your proposed namespace makes sense.
We had the ability to enrich a bluesky run with additional (synthetic) streams from the archiver appliance in databroker v0: https://github.com/bluesky/databroker/blob/main/databroker/eventsource/archiver.py .
I like that this scopes the auxiliary data (e.g. archiver) inside the BlueskyRun (good!) but outside the streams. I think the distinction between "a part of the original document stream" and "stapled on later for convenience" is worth surfacing at this level.
This looks interesting! Could the views concept be considered analogous to NeXus application definitions? If so, it opens the door to things we struggled to do in NeXus-land, such as a stricter schema.
@callumforrester that was definitely something we were thinking when we were talking about it a while back. I think of views as sort of a replacement for Databroker's projection/projector facility, where you could map to one or more ontologies. Views have the advantage of being a little deeper in the data model. Projection really just affected the resulting xarray that Databroker produced.
Some similar logic was implemented previously for reading NeXus files in https://github.com/DiamondLightSource/hdfmap