pynwb
pynwb copied to clipboard
Fix bad validation check for postion in ElectrodeGroup.__init__
Motivation
Fix #1768 ElectrodeGroup.position is a compound dataset of x,y,z positions, but ElectrodeGroup.__init__ validates it wrongly by enforcing that the array has 3 elements instead of checking that the elements have 3 elements. This PR updates ElectrodeGroup.__init__ to validate ElectrodeGroup.position correctly
How to test the behavior?
See the linked issue for details
Checklist
- [x] Did you update CHANGELOG.md with your changes?
- [X] Have you checked our Contributing document?
- [X] Have you ensured the PR clearly describes the problem and the solution?
- [X] Is your contribution compliant with our coding style? This can be checked running
flake8from the source directory. - [X] Have you checked to ensure that there aren't other open Pull Requests for the same change?
- [X] Have you included the relevant issue number using "Fix #XXX" notation where XXX is the issue number? By including "Fix #XXX" you allow GitHub to close issue #XXX when the PR is merged.
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 92.05%. Comparing base (
f136410) to head (c7962f7).
Additional details and impacted files
@@ Coverage Diff @@
## dev #1770 +/- ##
==========================================
+ Coverage 92.02% 92.05% +0.03%
==========================================
Files 27 27
Lines 2620 2630 +10
Branches 685 689 +4
==========================================
+ Hits 2411 2421 +10
Misses 139 139
Partials 70 70
| Flag | Coverage Δ | |
|---|---|---|
| integration | 73.15% <41.66%> (-0.09%) |
:arrow_down: |
| unit | 83.95% <91.66%> (+0.02%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Just some detective work for background:
- It seems that
ElectrodeGroup.positionwas added in NWB v2.2 in #1146. I believe this bug has existed since the field was added. - Looking at https://github.com/NeurodataWithoutBorders/nwb-schema/commit/59db28ee96cfe63f7304789cc879cd94a75632ce it seems that
ElectrodeGroup.positionwas first added as an array of length 3 (which is what the current check is looking for) and then changed to a compound data type. I.e., my guess is that PyNWB didn't get updated when the change to compound happened, and the unit tests did not fully cover this field.
Given that it has take 3 years for this issue to surface, it seems that ElectrodeGroup.position is probably rarely used.
@oruebel Any updates on this PR?
@stephprince could you take a look at this PR. I think this is essentially complete but needs some final attention to get over the finish line.
@stephprince do you have a chance to take a look at this? thanks!
Yes sorry for the delay I'll take a look!
@rly this is ready to review
Ok, after doing a full review of the issue and PR, I am also confused as to why ElectrodeGroup.position is allowed to be a list of tuples. I think @oruebel may be mistaken. ElectrodeGroup.position should be a scalar dataset with compound dtype (x, y, z). @oruebel can you review?
I think @oruebel may be mistaken.
ElectrodeGroup.positionshould be a scalar dataset with compound dtype (x, y, z)
The schema here says:
datasets:
- name: position
dtype:
- name: x
dtype: float32
doc: x coordinate
- name: y
dtype: float32
doc: y coordinate
- name: z
dtype: float32
doc: z coordinate
doc: stereotaxic or common framework coordinates
quantity: '?'
I.e., position is a compound dataset but the shape is not specified. Looking at the schema language docs here, I believe this means that shape: null is assumed so that it should be scalar. So yes, it seems that this should indeed be a scalar with a compound data type. So this PR should be updated accordingly.
Question, how does the input for scalar compound datasets need to look like? Looking at the source issue #1768 setting:
position=np.array([1., 2., 3.]): results in an error due to the validation in ElectrodeGroupposition=(1., 2., 3.): is being written as[(1., 2., 3.) (1., 2., 3.) (1., 2., 3.)]position=[1., 2., 3.]: is being written as[(1., 1., 1.) (2., 2., 2.) (3., 3., 3.)]
Specifying as [(1, 2, 3), ]) as currently on the PR fixes the bad write. Is position=[(1, 2, 3), ]) the right form and we need to updated the validation to check that only a single tuple is given or is position=(1., 2., 3.) the correct form and we need to dig deeper into write?