pynwb icon indicating copy to clipboard operation
pynwb copied to clipboard

[Bug]: `epoch_tags` not written to disk

Open bjhardcastle opened this issue 1 year ago • 2 comments

What happened?

epoch_tags can be assigned to an object, but aren't written to disk correctly.

  • tested with hdf5 and zarr backends with the same results - it was clear from the zarr file that it was not being written to disk (at the root or in general)
  • tested with set, tuple, list, with the same results
  • also tried instantiating the object, then doing nwb.epoch_tags.update({'tag1', 'tag2'})

Steps to Reproduce

import pynwb
import datetime

nwb = pynwb.NWBFile(
    session_id='test',
    session_description='test',
    identifier='12345',
    session_start_time=(
        datetime.datetime.now()
    ),
    epoch_tags={'tag1', 'tag2'},
)
print(f'memory: {nwb.epoch_tags = }')

path = 'test1.nwb'

with pynwb.NWBHDF5IO(path, 'w') as io:
    io.write(
        nwb, link_data=False
    )
    
with pynwb.NWBHDF5IO(path, 'r') as io:
    print(f'disk: {io.read().epoch_tags}')
memory: nwb.epoch_tags = {'tag1', 'tag2'}
disk: set()

Traceback

No response

Operating System

Windows

Python Executable

Python

Python Version

3.11

Package Versions

environment_for_issue.txt

Code of Conduct

bjhardcastle avatar May 03 '24 02:05 bjhardcastle

Thanks for reporting this bug! I was able to replicate that epoch_tags is not getting written to disk. It looks like this might be because the epoch_tags nwbfile attribute is not specified in the schema. We will take a look at this.

As a potential workaround, if you want to access tags that were added to the epochs table, you should be able to access them like so:

import pynwb
import datetime
from pynwb import NWBHDF5IO

nwb = pynwb.NWBFile(
    session_id='test',
    session_description='test',
    identifier='12345',
    session_start_time=(
        datetime.datetime.now()
    ),
)
nwb.add_epoch(start_time=1.0, stop_time=10.0, tags=["tag1", "tag2"])
nwb.add_epoch(start_time=10.0, stop_time=20.0, tags=["tag1", "tag3"])
print(f'memory: {nwb.epoch_tags}')

path = 'test1.nwb'
with pynwb.NWBHDF5IO(path, 'w') as io:
    io.write(nwb, link_data=False)

with pynwb.NWBHDF5IO(path, 'r') as io:
    nwb_read = io.read()
    epoch_tags = set(tag for epoch in nwb_read.epochs[:, "tags"] for tag in epoch)
    print(f'disk: {epoch_tags}')

Which returns:

memory: {'tag1', 'tag3', 'tag2'}
disk: {'tag1', 'tag3', 'tag2'}

Would something like that work for your purposes?

stephprince avatar May 03 '24 22:05 stephprince

Thanks, yes that's how I'd create my sequence of tags for adding to the NWBFile instance anyway. I mainly wanted them to appear correctly in the html repr displayed in a notebook.

If they're not in the schema already, how about just removing the ability to assign them altogether? It does seem redundant...

NWBFile.epoch_tags could just be a property getter that returns the tags on-demand from the epochs table. Then they're always in sync and duplicate information won't be stored to disk.

bjhardcastle avatar May 04 '24 01:05 bjhardcastle