Revived proposal for representing BlueskyRuns in Tiled

Open danielballan opened this issue 1 year ago • 7 comments

Brief summary of discussions with @padraic-shafer, @genematx, and @tacaswell, drawing on earlier discussions with @dylanmcreynolds and @whs92.


Goals:

  • Provide a stable URL to a given column of data, regardless of whether it is in-line in the Event documents or external.
  • Data in a given stream should be presented to the user in a flat namespace, not placing in the URL path the details about internal versus external storage.
  • Server should reveal to clients how the data is grouped into tables and arrays underneath, facilitating efficient chunked access (i.e. not separate requests for each column).
  • It should be possible to add "views", such as simplified/flatter views, of the data in a BlueskyRun. (NeXus does something similar.) A canonical example is re-mapping a mapping scan (1D in the canonical Bluesky representation) into a more scientifically relevant N-dimensional view (reshaping).
  • Lay track for potentially giving a way to access auxiliary information (e.g. proxied from Archiver Appliance).
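As a rough illustration of the reshaping goal above, here is a minimal sketch of what a "view" might do to a mapping scan. It assumes a plain row-major raster (no snaking) and uses a hypothetical helper, `reshape_raster`, that is not part of Tiled or Bluesky.

```python
def reshape_raster(flat, shape):
    """Regroup a flat, row-major sequence of readings into rows x cols nested lists.

    Hypothetical helper for illustration only; not a Tiled or Bluesky API.
    """
    rows, cols = shape
    if len(flat) != rows * cols:
        raise ValueError(f"expected {rows * cols} readings, got {len(flat)}")
    # Slice the 1D sequence into consecutive rows of length `cols`.
    return [list(flat[r * cols:(r + 1) * cols]) for r in range(rows)]


# Made-up 2 x 3 mapping scan: six Events recorded one after another (1D).
readings = [10, 11, 12, 20, 21, 22]
grid = reshape_raster(readings, (2, 3))
print(grid)  # [[10, 11, 12], [20, 21, 22]]
```

The canonical 1D data stays untouched; the view only presents it in a different layout.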

Revive https://github.com/bluesky/tiled/pull/668, which proposed two ideas:

  • A new structure family named ~union~ "consolidated", which lets us cleanly represent a Bluesky Stream backed by a combination of tables and arrays
  • A concept of "views", alternate (simplified...) layouts of the canonical data
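To make the "consolidated" idea more concrete, here is a toy, in-memory sketch of a flat key namespace layered over grouped storage. The class `ConsolidatedStream`, its methods, and the part names `"internal"` and `"image"` are all assumptions for illustration; they are not the structure family as implemented in Tiled.

```python
class ConsolidatedStream:
    """Toy stand-in for a 'consolidated' node: flat keys over grouped parts.

    Hypothetical illustration only; not the Tiled implementation.
    """

    def __init__(self, parts):
        # `parts` maps a part name to a dict of {key: column data}.
        self._parts = parts
        self._key_to_part = {}
        for part_name, columns in parts.items():
            for key in columns:
                if key in self._key_to_part:
                    raise ValueError(f"duplicate key {key!r}")
                self._key_to_part[key] = part_name

    def keys(self):
        # Flat namespace: callers never see internal vs. external storage here.
        return list(self._key_to_part)

    def __getitem__(self, key):
        # Look up a column by key alone, wherever it is stored.
        return self._parts[self._key_to_part[key]][key]

    def part_of(self, key):
        # Reveal the grouping so a client can fetch co-stored columns together.
        return self._key_to_part[key]

    def read_part(self, part_name):
        # One access for every column stored in the same part.
        return dict(self._parts[part_name])


stream = ConsolidatedStream(
    {
        "internal": {"time": [0.0, 1.0], "motor": [1.5, 2.5]},  # in-line Event data
        "image": {"image": [[0, 1], [2, 3]]},  # externally stored array
    }
)
print(stream.keys())            # ['time', 'motor', 'image']
print(stream.part_of("motor"))  # internal
```

A key's address stays the same whether its data lives in a table or an external array, while `part_of` and `read_part` expose the grouping for clients that want efficient chunked access.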

Both suggestions were developed and viewed positively at the time, but set aside back in March to make progress on other things. Now, we propose to implement both.

In URLs:

# Bluesky stuff...
.../<uuid>/streams/<stream name>  # "consolidated" structure family
.../<uuid>/streams/<stream name>?part=<part>  # table of columns stored together, or array
.../<uuid>/streams/<stream name>/<key>  # array
.../<uuid>/config/<stream name>/<obj name>  # table of configuration readings

# New concept: simplified/rearranged "views":
.../<uuid>/views/<view name>/...  # server-side registerable views

# New concept: auxiliary info (e.g. from EPICS Archiver Appliance)
.../<uuid>/auxiliary/<aux name>/...  # information from outside Tiled
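The layout above can be sketched as a pair of small path builders. The function names `stream_path` and `config_path` are hypothetical, and the returned strings are only the portion after the elided `.../` prefix, which is left unspecified here as it is in the proposal.

```python
from urllib.parse import quote, urlencode


def stream_path(uuid, stream, key=None, part=None):
    """Build the path (after the elided prefix) for a stream, one key, or one part.

    Hypothetical helper mirroring the proposed layout; not a Tiled API.
    """
    if key is not None and part is not None:
        raise ValueError("specify at most one of key or part")
    path = f"{quote(uuid, safe='')}/streams/{quote(stream, safe='')}"
    if key is not None:
        # A single column, addressed the same way wherever it is stored.
        return f"{path}/{quote(key, safe='')}"
    if part is not None:
        # A group of columns stored together (table), or an array.
        return f"{path}?{urlencode({'part': part})}"
    return path  # the whole "consolidated" node


def config_path(uuid, stream, obj):
    """Build the path for one object's table of configuration readings."""
    return "/".join(quote(s, safe="") for s in (uuid, "config", stream, obj))


print(stream_path("abc", "primary"))                   # abc/streams/primary
print(stream_path("abc", "primary", key="image"))      # abc/streams/primary/image
print(stream_path("abc", "primary", part="internal"))  # abc/streams/primary?part=internal
print(config_path("abc", "primary", "fccd"))           # abc/config/primary/fccd
```

The part name `"internal"` and the uuid `"abc"` are placeholders chosen for the example.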

In Python API:

run.streams["primary"].read()  # xarray.Dataset
run.streams["primary"]["image"]  # array
run.config["primary"]["fccd"].read()  # table of configuration readings

run.views["simple"].read()
# and equivalently...
run.read(view="simple")  # might even be the default for run.read(), which currently raises...
run.aux["archiver"]["PV..."].read()  # array

danielballan avatar Dec 06 '24 20:12 danielballan

Nice summary. I haven't really thought about this auxiliary concept. It's fascinating to me but feels lower priority than the other endpoints?

dylanmcreynolds avatar Dec 06 '24 20:12 dylanmcreynolds

Yes, we fully agree. It is a new concept that came up this week, included only to buttress the idea that having a namespace in that position (streams, config, views, aux) opens up multiple potentially interesting paths.

danielballan avatar Dec 06 '24 21:12 danielballan

I think the ability to also easily access archive data is amazing.

I think your proposed namespace makes sense.

whs92 avatar Dec 10 '24 13:12 whs92

We had the ability to enrich a bluesky run with additional (synthetic) streams from the archiver appliance in databroker v0: https://github.com/bluesky/databroker/blob/main/databroker/eventsource/archiver.py .

tacaswell avatar Dec 10 '24 15:12 tacaswell

I like that this scopes the auxiliary data (e.g. archiver) inside the BlueskyRun (good!) but outside the streams. I think the distinction between "a part of the original document stream" and "stapled on later for convenience" is worth surfacing at this level.

danielballan avatar Dec 12 '24 14:12 danielballan

This looks interesting! Could the views concept be considered analogous to NeXus application definitions? If so, it opens the door to things we struggled to do in nexus-land, such as a stricter schema.

callumforrester avatar Dec 16 '24 10:12 callumforrester

@callumforrester that was definitely something we were thinking when we were talking about it a while back. I think of views as sort of a replacement for Databroker's projection/projector facility, where you could map to one or more ontologies. Views have the advantage of being a little deeper in the data model. Projection really just affected the resulting xarray that Databroker produced.

dylanmcreynolds avatar Dec 16 '24 16:12 dylanmcreynolds

Some similar logic was implemented earlier for reading NeXus files in https://github.com/DiamondLightSource/hdfmap

stan-dot avatar Mar 24 '25 16:03 stan-dot