pynwb
pynwb copied to clipboard
[Bug]: Writing ElectrodeGroup position saves incorrectly modified data to file
What happened?
There seem to be some inconsistencies in writing ElectrodeGroup position to a file depending on the input type:
- For
np.ndarray
inputs, the__init__
function checks whether position was provided usingif position
instead ofif position is not None
, which leads to the usualThe truth value of an array with more than one element is ambiguous
error - For tuple inputs, the saved
position
data seems to be duplicated into an array of tuples like[(x, y, z), (x, y, z), (x, y, z)]
- For list inputs, the saved
position
data seems to be duplicated into a array of tuples like[(x, x, x), (y, y, y), (z, z, z)]
These last two things only happen on the write/read round trip, not before writing the file.
Steps to Reproduce
import numpy as np
import os
from pynwb import NWBFile, NWBHDF5IO
from uuid import uuid4
from datetime import datetime
nwbfile = NWBFile(
session_description="no description.",
session_start_time=datetime.now(),
identifier=str(uuid4()),
)
device = nwbfile.create_device(name="Device", description="no description.")
electrode_group = nwbfile.create_electrode_group(
name="ElectrodeGroup",
description="no description.",
device=device,
location="unknown",
# position=np.array([1., 2., 3.]),
position=(1., 2., 3.),
# position=[1., 2., 3.],
)
with NWBHDF5IO('test.nwb', 'w') as io:
io.write(nwbfile)
with NWBHDF5IO('test.nwb', 'r') as io:
read_nwbfile = io.read()
print(read_nwbfile.electrode_groups['ElectrodeGroup'])
print(read_nwbfile.electrode_groups['ElectrodeGroup'].position[()])
Traceback
# for position = np.array([1., 2., 3.])
Traceback (most recent call last):
File "/home/jovyan/scripts/test_write_electrode_group_position.py", line 16, in <module>
electrode_group = nwbfile.create_electrode_group(
File "/opt/conda/lib/python3.10/site-packages/hdmf/utils.py", line 644, in func_call
return func(args[0], **pargs)
File "/opt/conda/lib/python3.10/site-packages/hdmf/container.py", line 1044, in _func
ret = container_type(**kwargs)
File "/opt/conda/lib/python3.10/site-packages/hdmf/utils.py", line 644, in func_call
return func(args[0], **pargs)
File "/opt/conda/lib/python3.10/site-packages/pynwb/ecephys.py", line 33, in __init__
if args_to_set['position'] and len(args_to_set['position']) != 3:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
# for position=(1., 2., 3.)
ElectrodeGroup pynwb.ecephys.ElectrodeGroup at 0x140662980868848
Fields:
description: no description.
device: Device pynwb.device.Device at 0x140662980868752
Fields:
description: no description.
location: unknown
position: <hdmf.backends.hdf5.h5_utils.ContainerH5TableDataset object at 0x7feea701fb50>
[(1., 2., 3.) (1., 2., 3.) (1., 2., 3.)]
# for position=[1., 2., 3.]
ElectrodeGroup pynwb.ecephys.ElectrodeGroup at 0x140659597409008
Fields:
description: no description.
device: Device pynwb.device.Device at 0x140659597408912
Fields:
description: no description.
location: unknown
position: <hdmf.backends.hdf5.h5_utils.ContainerH5TableDataset object at 0x7feddd567b50>
[(1., 1., 1.) (2., 2., 2.) (3., 3., 3.)]
Operating System
Linux
Python Executable
Python
Python Version
3.10
Package Versions
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
- [X] Have you checked the Contributing document?
- [X] Have you ensured this bug was not already reported?
Thanks for reporting this bug.
Issue 1: Specify ElectrodeGroup.position
as a compound array
ElectrodeGroup.position
is actually a compound dataset:
What this means is that each element in the position
array is a (x,y,z)
coordinate. I.e., the correct way to specify the argument would be as either a list of tuples (where each tuple has 3 floats) or as a structured numpy array with a dtype consisting of three floats. I.e., the correct way to specify position should be position=[(1., 2., 3.), ]
or position=np.array([(1., 2., 3.,)], dtype= np.dtype([('x', float), ('y', float), ('z', float)])
.
The reason you are seeing values being "duplicated" is because position is created in the file in accordance with the schema as a compound dataset with a dtype consisting of three floats ((float,float,float)
). Each entry in [1., 2., 3.] is then interpreted as a separate position, and so the values are implicitly expanded to match the compound data type.
Issue 2: How ElectrodeGroup.__init__
checks the position
argument is wrong
As you mentioned, the error check in ElectrodeGroup.__init__
raises an error for the numpy array. However, beyond that, the check incorrectly checks for the length of the array to be 3. However, this is not the correct check. Instead we should check that the elements in the array (or the dtype) have length 3. I.e., fixing the example alone by setting position to position=[(1., 2., 3.), ]
will not fix the issue. I'll submit a PR shortly to address this issue.
#1770 should fix the issue.
If you modify the example code to create the ElectrodeGroup as follows:
electrode_group = nwbfile.create_electrode_group(
name="ElectrodeGroup",
description="no description.",
device=device,
location="unknown",
position=[(1., 2., 3.),]
)
then this should work now with the fix on #1770.
Perfect, thank you for the explanation!