pynwb icon indicating copy to clipboard operation
pynwb copied to clipboard

Fix bad validation check for postion in ElectrodeGroup.__init__

Open oruebel opened this issue 2 years ago • 9 comments

Motivation

Fix #1768 ElectrodeGroup.position is a compound dataset of x,y,z positions, but ElectrodeGroup.__init__ validates it wrongly by enforcing that the array has 3 elements instead of checking that the elements have 3 elements. This PR updates ElectrodeGroup.__init__ to validate ElectrodeGroup.position correctly

How to test the behavior?

See the linked issue for details

Checklist

  • [x] Did you update CHANGELOG.md with your changes?
  • [X] Have you checked our Contributing document?
  • [X] Have you ensured the PR clearly describes the problem and the solution?
  • [X] Is your contribution compliant with our coding style? This can be checked running flake8 from the source directory.
  • [X] Have you checked to ensure that there aren't other open Pull Requests for the same change?
  • [X] Have you included the relevant issue number using "Fix #XXX" notation where XXX is the issue number? By including "Fix #XXX" you allow GitHub to close issue #XXX when the PR is merged.

oruebel avatar Aug 29 '23 10:08 oruebel

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 92.05%. Comparing base (f136410) to head (c7962f7).

Additional details and impacted files
@@            Coverage Diff             @@
##              dev    #1770      +/-   ##
==========================================
+ Coverage   92.02%   92.05%   +0.03%     
==========================================
  Files          27       27              
  Lines        2620     2630      +10     
  Branches      685      689       +4     
==========================================
+ Hits         2411     2421      +10     
  Misses        139      139              
  Partials       70       70              
Flag Coverage Δ
integration 73.15% <41.66%> (-0.09%) :arrow_down:
unit 83.95% <91.66%> (+0.02%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Aug 29 '23 10:08 codecov[bot]

Just some detective work for background:

  • It seems that ElectrodeGroup.position was added in NWB v2.2 in #1146. I believe this bug has existed since the field was added.
  • Looking at https://github.com/NeurodataWithoutBorders/nwb-schema/commit/59db28ee96cfe63f7304789cc879cd94a75632ce it seems that ElectrodeGroup.position was first added as an array of length 3 (which is what the current check is looking for) and then changed to a compound data type. I.e., my guess is that PyNWB didn't get updated when the change to compound happened, and the unit tests did not fully cover this field.

Given that it has take 3 years for this issue to surface, it seems that ElectrodeGroup.position is probably rarely used.

oruebel avatar Aug 29 '23 12:08 oruebel

@oruebel Any updates on this PR?

CodyCBakerPhD avatar Sep 28 '23 17:09 CodyCBakerPhD

@stephprince could you take a look at this PR. I think this is essentially complete but needs some final attention to get over the finish line.

oruebel avatar Jan 25 '24 04:01 oruebel

@stephprince do you have a chance to take a look at this? thanks!

rly avatar Mar 05 '24 15:03 rly

Yes sorry for the delay I'll take a look!

stephprince avatar Mar 05 '24 16:03 stephprince

@rly this is ready to review

stephprince avatar Mar 12 '24 15:03 stephprince

Ok, after doing a full review of the issue and PR, I am also confused as to why ElectrodeGroup.position is allowed to be a list of tuples. I think @oruebel may be mistaken. ElectrodeGroup.position should be a scalar dataset with compound dtype (x, y, z). @oruebel can you review?

rly avatar Mar 13 '24 17:03 rly

I think @oruebel may be mistaken. ElectrodeGroup.position should be a scalar dataset with compound dtype (x, y, z)

The schema here says:

  datasets:
  - name: position
    dtype:
    - name: x
      dtype: float32
      doc: x coordinate
    - name: y
      dtype: float32
      doc: y coordinate
    - name: z
      dtype: float32
      doc: z coordinate
    doc: stereotaxic or common framework coordinates
    quantity: '?'

I.e., position is a compound dataset but the shape is not specified. Looking at the schema language docs here, I believe this means that shape: null is assumed so that it should be scalar. So yes, it seems that this should indeed be a scalar with a compound data type. So this PR should be updated accordingly.

Question, how does the input for scalar compound datasets need to look like? Looking at the source issue #1768 setting:

  • position=np.array([1., 2., 3.]) : results in an error due to the validation in ElectrodeGroup
  • position=(1., 2., 3.) : is being written as [(1., 2., 3.) (1., 2., 3.) (1., 2., 3.)]
  • position=[1., 2., 3.] : is being written as [(1., 1., 1.) (2., 2., 2.) (3., 3., 3.)]

Specifying as [(1, 2, 3), ]) as currently on the PR fixes the bad write. Is position=[(1, 2, 3), ]) the right form and we need to updated the validation to check that only a single tuple is given or is position=(1., 2., 3.) the correct form and we need to dig deeper into write?

oruebel avatar May 02 '24 23:05 oruebel