cf-xarray icon indicating copy to clipboard operation
cf-xarray copied to clipboard

Add attribute tracking functionality

Open dcherian opened this issue 2 years ago • 3 comments

Simple test of https://github.com/pydata/xarray/pull/5668 (cc @keewis)

This just adds the cell_methods attribute for reductions and a placeholder for tracking history.

import cf_xarray as cfxr
import xarray as xr

da = xr.DataArray([1, 2, 3], dims="time")
with xr.set_options(keep_attrs=cfxr.track_cf_attributes(cell_methods=True)):
    print(da.mean("time").attrs)
{'cell_methods': 'dim_name: mean ', 'history': None}

There is one top-level function

cfxr.tracking.track_cf_attributes(cell_methods=True, history=True)

This returns a partial function that can be provided to keep_attrs i.e. it expects the args attrs, context. I think we could add things like provenance=True etc. in the future.

Right now only the private _tracker function satisfies the contract expected by keep_attrs. Alternately, we could provide track_cell_methods track_history etc. that satisfy the contract so that users can instead do for fine control

xr.set_options(keep_attrs=cfxr.tracking.track_history)

I'm interested in opinions on whether we should do this second approach instead.

cc @huard @aulemahal

dcherian avatar Aug 03 '21 17:08 dcherian

Codecov Report

Merging #253 (e972dde) into main (16d8c39) will decrease coverage by 0.72%. The diff coverage is 40.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #253      +/-   ##
==========================================
- Coverage   96.52%   95.79%   -0.73%     
==========================================
  Files          16       17       +1     
  Lines        1926     1951      +25     
==========================================
+ Hits         1859     1869      +10     
- Misses         67       82      +15     
Flag Coverage Δ
unittests 95.79% <40.00%> (-0.73%) :arrow_down:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
cf_xarray/tracking.py 34.78% <34.78%> (ø)
cf_xarray/__init__.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 16d8c39...e972dde. Read the comment docs.

codecov[bot] avatar Aug 03 '21 17:08 codecov[bot]

While I like the idea of xr.set_options(keep_attrs=cfxr.tracking.track_history), it becomes complex when you want to track both! And in the hypothetical case where we add even more things to track, it'll be a nightmare. So I vote for the single function option!

Otherwise, this functionality is very interesting!

aulemahal avatar Aug 03 '21 17:08 aulemahal

Right now only the private _tracker function satisfies the contract expected by keep_attrs.

Actually this is wrong! Both add_history and add_cell_methods satisfy the contract, so this was a useless question. I apologize for the noise.

dcherian avatar Aug 03 '21 22:08 dcherian