python validation edge-case: empty TimeSeries

Open bendichter opened this issue 8 months ago • 4 comments

I am reviewing an NWB file submitted by BICAN that is currently on DANDI:

sub-1265174261_ses-1265661925_icephys.nwb.zip

https://dandiarchive.org/dandiset/001351/0.250401.1555/files?location=sub-1265174261&page=1

It has a peculiar feature: the data and timestamps arrays are empty for some of the TimeSeries objects in /processing/spikes/*, e.g. /processing/spikes/Sweep_20/data. This causes an error in nwbinspector, which assumes that if a timestamps array exists then it is non-empty. See the issue here: https://github.com/NeurodataWithoutBorders/nwbinspector/issues/582
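To illustrate the kind of assumption involved, here is a minimal sketch of a timestamps check that tolerates empty arrays. The function name `timestamps_are_ascending` is hypothetical and this is not nwbinspector's actual code; it only shows how a length guard keeps a check from indexing into an empty array.

```python
from typing import Sequence


def timestamps_are_ascending(timestamps: Sequence[float]) -> bool:
    """Hypothetical check: True when timestamps never decrease.

    An empty or single-element sequence is treated as trivially ascending,
    so the check never has to read an element that does not exist.
    """
    if len(timestamps) < 2:
        return True
    # Compare each timestamp with its successor.
    return all(a <= b for a, b in zip(timestamps, timestamps[1:]))
```

Whether an empty array should pass silently or be reported as its own finding is exactly the policy question raised here; the sketch only avoids the crash.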

Do we want to allow this?

bendichter avatar Apr 14 '25 16:04 bendichter

@rly I'd love to hear your thoughts on this

bendichter avatar Apr 14 '25 17:04 bendichter

Data and timestamps both have shape (0,). In the spikes TimeSeries that do contain data, data == timestamps. These appear to be the times of detected spike events for each sweep. An empty TimeSeries suggests that no spikes were detected on the corresponding sweep. Also, some sweeps do not have a corresponding spikes TimeSeries at all. Would it be all right for the user (and software) to not write an empty spikes TimeSeries?

Does a spikes TimeSeries with empty data and timestamps mean spike detection was run on the data and there were no results, or spike detection was not run on the data? Is it useful to distinguish the two cases? @t-b I would be curious to get your input. This file was likely written using IGOR because it uses ndx-mies and the general metadata has MIES information.

If the lack of a TimeSeries is as informative as an empty TimeSeries, then I think an empty TimeSeries has no value and should not be allowed.
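To make the question concrete, here is a small sketch of the three cases a reader of the file would have to tell apart if empty TimeSeries were allowed. The names `SpikeDetection`, `classify_sweep`, and `example_spikes` are hypothetical, the `Sweep_21` values are made up, and the meaning attached to each case is an assumption about the writer's intent, not something the file confirms.

```python
from enum import Enum
from typing import Mapping, Sequence


class SpikeDetection(Enum):
    """Assumed interpretation of each on-disk state (not confirmed)."""

    ABSENT = "no spikes TimeSeries written for this sweep"
    EMPTY = "spikes TimeSeries present, data has shape (0,)"
    POPULATED = "spikes TimeSeries present with spike times"


def classify_sweep(
    spikes_by_sweep: Mapping[str, Sequence[float]], sweep: str
) -> SpikeDetection:
    """Classify one sweep from a mapping of sweep name to spike times."""
    if sweep not in spikes_by_sweep:
        return SpikeDetection.ABSENT
    if len(spikes_by_sweep[sweep]) == 0:
        return SpikeDetection.EMPTY
    return SpikeDetection.POPULATED


# Illustrative data only: Sweep_20 is empty as in the file under review,
# the Sweep_21 spike times are invented, and Sweep_22 has no entry.
example_spikes = {"Sweep_20": [], "Sweep_21": [0.12, 0.35]}

for name in ("Sweep_20", "Sweep_21", "Sweep_22"):
    print(name, classify_sweep(example_spikes, name).name)
```

If ABSENT and EMPTY carry the same meaning for downstream users, the EMPTY case adds nothing; if they differ, forbidding empty TimeSeries would lose that distinction.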

(Also, once ndx-events is merged into core NWB, ideally these detected spike times would be stored as events, not time series.)

rly avatar Apr 15 '25 18:04 rly

@rly Thanks for pinging me. The original file was written using MIES, but processing/spikes does not come from MIES; it was added by some downstream software package. I'll try to find someone who knows more.

My guess is that the creator wanted to distinguish "no matches" from "forgot to look at that sweep".

t-b avatar Apr 15 '25 21:04 t-b

Yes, I think it's important to understand the use case, and we should also consider the broader implications: the same question applies to all Datasets of any Neurodata Type.

bendichter avatar Apr 15 '25 21:04 bendichter