Missing reverse compatibility with V1

Open ambarb opened this issue 3 years ago • 2 comments

Some features of the V1 methodology that were available in earlier versions of databroker are now missing in databroker version 1.2.3. It is not clear to me whether this was a design choice.

Expected Behavior

  1. headers = db[123,125] ; headers[0].table() and other such "batch" header and table processing.
  2. tbl_bl = h.table('baseline', fields = ['nanop']). Nothing in the table stream is named "nanop", but nanop is the parent of many things, such as nanop_tx, nanop_tx_setpoint, and so on.
  3. header search (via db()) works, but something is wrong with how headers are handled.

BEFORE ITEM 3 broke:

headers = db(since=_my_start_time, until = _my_stop_time)
for h in headers:
    print(h.start["scan_id"])
    t = h.table()

AFTER ITEM 3 broke:

headers = db(since=_my_start_time, until = _my_stop_time)
for h in headers:
    print(h.start["scan_id"])
    h = db[h.start["scan_id"]]
    t = h.table()

Current Behavior

These things don't work for databroker version 1.2.3 and raise exceptions.

Your Environment

Testing against databroker version 0.12.

  • Some functionality of item 1 (headers.table()) was lost prior to this version. For instance, we could get an average dark image from a series of dark-image scan uids without any special functions. It just worked.
  • It seems that item 2 doesn't actually return anything in this version. I am unsure if it ever "worked", but it would be a nice enhancement so that people can extract all the children of a diffractometer, or whatever complicated ophyd device they are after, without extracting the entire baseline stream, which can be extensive.

ambarb avatar Jun 08 '21 16:06 ambarb

@ambarb's report is for v1.2.3, but this same issue applies to the tiled-refactor (upstream) branch, and the fix should be targeted there, because that is the "secure-able databroker" that will be running ~everywhere ASAP. I expect we will recommend that all users, including users external to NSLS-II, use Databroker v2.0.0 when it is ready, using either databroker.v1 or the new tiled API, and falling back to databroker.v0 if there are serious problems.

Referring to the items at the top of the OP, I have fixed Items 1 and 3 in the tiled-refactor branch. Notes to whoever fixes Item 2:

Databroker v0 had a feature, which with hindsight I might call an anti-feature, where table(fields=...) returns all columns that regex-match the given fields. In the OP's example, fields=['nanop'], this has the effect of matching all the children of the nanop Device. The matching rules are complex because they not only match event['data'] keys but also search the descriptor, stop, and start documents with some defined precedence and "project" these scalar values into columns.
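To make the column-matching part concrete, here is a minimal sketch in plain Python. It is not the databroker.v0 code: select_columns is a hypothetical helper, the column names other than nanop_tx and nanop_tx_setpoint are made up, and it assumes each entry in fields is a regex applied from the start of the column name. The real anchoring and precedence rules may differ, and the projection of start/stop/descriptor values is left out entirely.

```python
import re


def select_columns(columns, fields):
    """Return the columns matched by any pattern in `fields`, in original order.

    Hypothetical helper for illustration only. Assumption: each entry of
    `fields` is a regular expression matched from the start of the name.
    """
    patterns = [re.compile(f) for f in fields]
    return [c for c in columns if any(p.match(c) for p in patterns)]


# Made-up baseline columns; only the nanop_* names echo the OP's example.
baseline_columns = ["time", "nanop_tx", "nanop_tx_setpoint", "nanop_ty", "sample_temp"]

selected = select_columns(baseline_columns, ["nanop"])
print(selected)  # ['nanop_tx', 'nanop_tx_setpoint', 'nanop_ty']
```

Under these assumptions, a bare device name such as 'nanop' picks up every child column that starts with it, which is the convenience the OP is asking for.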

With hindsight, I would advocate implementing this in a higher layer so that it becomes an opt-in feature. Therefore, at least for now, I think we should implement it in databroker.v1:Broker (for full back-compat with databroker.v0) but not in the underlying object databroker.mongo_normalized:Catalog.
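A rough sketch of that layering, with hypothetical stand-in classes rather than the real databroker.v1:Broker or databroker.mongo_normalized:Catalog: the lower layer accepts exact field names only, and the back-compat wrapper expands patterns before delegating. The class names, the read method, and the prefix-matching rule are all assumptions made for illustration.

```python
import re


class StrictTable:
    """Hypothetical lower layer: exact field names only, no pattern matching."""

    def __init__(self, data):
        self._data = data  # maps column name -> list of values

    def keys(self):
        return list(self._data)

    def read(self, fields):
        # An unknown name raises KeyError; nothing is expanded here.
        return {name: self._data[name] for name in fields}


class CompatTable:
    """Hypothetical back-compat layer: expands regex patterns, then delegates."""

    def __init__(self, strict):
        self._strict = strict

    def read(self, fields):
        # Assumption: patterns are matched from the start of each column name.
        patterns = [re.compile(f) for f in fields]
        expanded = [k for k in self._strict.keys() if any(p.match(k) for p in patterns)]
        return self._strict.read(expanded)


strict = StrictTable(
    {"nanop_tx": [0.1, 0.2], "nanop_ty": [1.0, 1.1], "sample_temp": [300, 301]}
)
compat = CompatTable(strict)
result = compat.read(["nanop"])
print(list(result))  # ['nanop_tx', 'nanop_ty']
```

Keeping the expansion in the wrapper means callers of the lower layer get predictable, exact-name behavior, while code written against the old API can opt in to the looser matching.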

Take a look at databroker.v1:Broker.get_documents which does already implement this and uses a utility function adapted from databroker.v0. That code can probably be reused to make it work in databroker.v1:Broker.get_table.

danielballan avatar Jun 11 '21 12:06 danielballan

As an additional hint, @tacaswell, this is where Broker.get_documents implements this.

https://github.com/bluesky/databroker/blob/970e9148dfab5e77101d40f059ecb30d064eac81/databroker/v1.py#L383-L435

danielballan avatar Jun 12 '21 12:06 danielballan