pybids icon indicating copy to clipboard operation
pybids copied to clipboard

modeling: Add transformation history to `BIDSStatsModelsNodeOutput`

Open adelavega opened this issue 3 years ago • 3 comments

Closes #900

  • Add argument keep_history to TransformerManager, which adds a history_ attribute. This is a list of snapshots ofBIDSVariableCollections after every transformation is applied (plus the original input collection as the 0th item).
  • Add argument collection_history to BIDSStatsModelsNodeOutput which take the Transformation history_ and saves it as BIDSVariableCollections.coll_hist.

I also added the following (which I'm happy to put in a separate PR, but I think should be fairly uncontroversial):

  • Simplify logic by removing create_model_spec function (which is only used by BIDSStatsModelsNodeOutput.__init__ and added comments.

adelavega avatar Oct 10 '22 19:10 adelavega

Codecov Report

Base: 86.38% // Head: 86.24% // Decreases project coverage by -0.13% :warning:

Coverage data is based on head (70eacb8) compared to base (c411a86). Patch coverage: 74.19% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #905      +/-   ##
==========================================
- Coverage   86.38%   86.24%   -0.14%     
==========================================
  Files          32       32              
  Lines        3885     3904      +19     
  Branches      942      947       +5     
==========================================
+ Hits         3356     3367      +11     
- Misses        342      346       +4     
- Partials      187      191       +4     
Impacted Files Coverage Δ
bids/modeling/model_spec.py 81.53% <ø> (-0.56%) :arrow_down:
bids/modeling/statsmodels.py 85.98% <71.42%> (-0.95%) :arrow_down:
bids/modeling/transformations/base.py 87.39% <76.47%> (-0.95%) :arrow_down:

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

codecov[bot] avatar Oct 10 '22 19:10 codecov[bot]

By default, when running Nodes, the transformation history is now stored within the outputs.

For example:

specs = root_node.run(group_by=root_node.group_by, force_dense=False)
specs[0].coll_hist
>>> {'run': [<bids.variables.collections.BIDSRunVariableCollection at 0x7f342d6767c0>,
  <bids.variables.collections.BIDSRunVariableCollection at 0x7f342da18b20>,
  <bids.variables.collections.BIDSRunVariableCollection at 0x7f342d676790>,
  <bids.variables.collections.BIDSRunVariableCollection at 0x7f342d6768b0>]}

The user can then inspect these variables as they passed through the Transformation as they wish.

Main places for improvement here would be to have a more useful container for these data. Perhaps a lightweight object that indicates which transformations were applied, the arguments, and the output.

adelavega avatar Oct 10 '22 21:10 adelavega

I've revised this to use a lightweight TransformationHistory dataclass object to keep track of the collections, including their level to allow them to be stored in a flat dictionary.

I also decided to omit the initial collection as this is redundant with what is found in the original Node.

Here's an example:

[TransformationOutput(index=0, output=<bids.variables.collections.BIDSRunVariableCollection object at 0x7f8123a6ce50>, transformation_name='Threshold', transformation_kwargs={'binarize': True, 'output': ['trials']}, input_cols=['gain'], level='run'),
 TransformationOutput(index=1, output=<bids.variables.collections.BIDSRunVariableCollection object at 0x7f8123d92100>, transformation_name='Scale', transformation_kwargs={'demean': True, 'rescale': False, 'output': ['gain', 'loss', 'demeaned_RT']}, input_cols=['gain', 'loss', 'RT'], level='run'),
 TransformationOutput(index=2, output=<bids.variables.collections.BIDSRunVariableCollection object at 0x7f8123b74940>, transformation_name='Convolve', transformation_kwargs={'model': 'spm'}, input_cols=['trials', 'gain', 'loss', 'demeaned_RT'], level='run')]

adelavega avatar Oct 10 '22 22:10 adelavega

Okay, this is done. I'll wait to see if @jmumford has any final comments, but if not this is ready to merge.

adelavega avatar Oct 20 '22 18:10 adelavega

@adelavega @jmumford Is this good to go?

effigies avatar Nov 02 '22 18:11 effigies

Oh-- I think so, for some reason I thought we had already merged it.

If you give it a final approval, go ahead and merge.

adelavega avatar Nov 03 '22 01:11 adelavega