python validation edge-case: empty TimeSeries
I am reviewing an NWB file submitted by BICAN currently on DANDI:
sub-1265174261_ses-1265661925_icephys.nwb.zip
https://dandiarchive.org/dandiset/001351/0.250401.1555/files?location=sub-1265174261&page=1
It has a peculiar feature: the /data and /timestamps arrays are empty for some of the TimeSeries objects in /processing/spikes/*, e.g. /processing/spikes/Sweep_20/data. This causes an error in nwbinspector, which assumes that a timestamps array, if present, is non-empty. See the issue here: https://github.com/NeurodataWithoutBorders/nwbinspector/issues/582
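To make the failure mode concrete, here is a minimal sketch of a timestamps check that tolerates an empty array instead of assuming at least one element. It is not the nwbinspector implementation; the function name `check_timestamps` and its return values are hypothetical, and it works on any sequence (a list or a NumPy array read from the file).

```python
def check_timestamps(timestamps):
    """Classify a timestamps sequence without assuming it is non-empty.

    Hypothetical helper for illustration only, not part of nwbinspector.
    Returns "empty", "not_ascending", or "ok".
    """
    # Guard first: indexing timestamps[0] on a shape-(0,) array would raise.
    if len(timestamps) == 0:
        return "empty"

    # Compare each timestamp with its successor.
    for earlier, later in zip(timestamps, timestamps[1:]):
        if later < earlier:
            return "not_ascending"
    return "ok"


# Made-up values, not taken from the file under review.
print(check_timestamps([]))
print(check_timestamps([0.1, 0.2, 0.35]))
print(check_timestamps([0.2, 0.1]))
```

Whether "empty" should then be reported as a violation or silently accepted is exactly the policy question raised below.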
Do we want to allow this?
@rly I'd love to hear your thoughts on this
Data and timestamps both have shape (0,). In the spikes TimeSeries that do contain data, data == timestamps. These appear to be the times of detected spike events for each sweep. An empty TimeSeries suggests that no spikes were detected on the corresponding sweep. Also, some sweeps do not have a corresponding spikes TimeSeries at all. Would it be all right for the user (and software) to not write an empty spikes TimeSeries?
Does a spikes TimeSeries with empty data and timestamps mean spike detection was run on the data and there were no results, or spike detection was not run on the data? Is it useful to distinguish the two cases? @t-b I would be curious to get your input. This file was likely written using IGOR because it uses ndx-mies and the general metadata has MIES information.
If the lack of a TimeSeries is as informative as an empty TimeSeries, then I think an empty TimeSeries has no value and should not be allowed.
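As a sketch of what is at stake, the three possible states of a sweep can be modeled with a plain mapping from sweep name to spike times. This does not use the pynwb API; `classify_sweeps`, the status labels, and all sweep names other than Sweep_20 are hypothetical, and the spike times are made up.

```python
def classify_sweeps(sweep_names, spikes_by_sweep):
    """Label each sweep by the state of its spikes TimeSeries.

    spikes_by_sweep maps a sweep name to its spike-time sequence; a sweep
    that is absent from the mapping has no spikes TimeSeries at all.
    Hypothetical helper for illustration only.
    """
    status = {}
    for name in sweep_names:
        if name not in spikes_by_sweep:
            status[name] = "no spikes TimeSeries"
        elif len(spikes_by_sweep[name]) == 0:
            status[name] = "empty spikes TimeSeries"
        else:
            status[name] = "spikes detected"
    return status


# Sweep_20 is empty in the file under review; the other entries are invented.
spikes = {"Sweep_20": [], "Sweep_21": [0.512, 0.734]}
for name, state in classify_sweeps(["Sweep_19", "Sweep_20", "Sweep_21"], spikes).items():
    print(name, "->", state)
```

If empty TimeSeries were disallowed, the first two states would collapse into one, which is only acceptable if nothing downstream relies on telling them apart.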
(Also, once ndx-events is merged into core NWB, ideally these detected spike times would be stored as events, not time series.)
@rly Thanks for pinging me. The original file was written using MIES, but processing/spikes does not come from MIES; it was added by some downstream software package. I'll try to find someone who knows more.
My guess is that the creator wanted to distinguish "no matches" from "forgot to look at that sweep".
Yes, I think it's important to understand the use-case, and we also should consider the broader implications. We should consider this question for all Datasets of any Neurodata Type.