`BlackrockRawIO` is over-segmenting data
I am working with a user's data that, according to them, contains no true segments at all: they never stopped and restarted the recording session. Nevertheless, the current `BlackrockRawIO` implementation creates thousands of segments.
The reason is the following logic in the code:
https://github.com/NeuralEnsemble/python-neo/blob/3b592a8509c6d76a205cae9d6198190782b4881c/neo/rawio/blackrockrawio.py#L1066-L1072
Here, if the difference between consecutive timestamps is larger than twice the expected difference based on the stream's sampling rate, a new segment is created. For the user's ns4 and ns6 files, these thresholds are as small as 0.2 ms and 0.067 ms respectively, which is overly strict. As a result, recordings with tiny millisecond-scale gaps (which I believe are buffer and/or jitter artifacts, not real breaks) are split into separate segments, even though the user insists the recording was continuous.
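To make the numbers concrete, here is a minimal sketch of that rule (simplified, with my own variable names; the real code works in clock ticks rather than seconds):

```python
# Simplified sketch of the rule linked above: a new segment starts whenever
# two consecutive timestamps are more than `multiple` nominal sampling
# periods apart, with the multiple currently hard-coded to 2.
def gap_threshold_s(sampling_rate_hz, multiple=2.0):
    return multiple / sampling_rate_hz

print(gap_threshold_s(10_000))  # ns4 at 10 kHz -> 0.0002 s   (0.2 ms)
print(gap_threshold_s(30_000))  # ns6 at 30 kHz -> ~6.67e-05 s (0.067 ms)
```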
The MATLAB NPMK implementation lets users control this threshold with the `max_tick_multiple` parameter:
https://github.com/BlackrockNeurotech/NPMK/blob/a5b3e3b25b6e2f4594ecbb99d3e0e5e517530959/NPMK/openNSx.m#L182-L186
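Something analogous in neo could expose the multiple instead of hard-coding it at 2. A sketch of what that could mean, assuming timestamps in seconds (the function and parameter names here are mine, not an existing API):

```python
import numpy as np

def find_segment_starts(timestamps_s, sampling_rate_hz, max_gap_multiple=2.0):
    """Indices where a new segment would begin, given a tolerated gap of
    max_gap_multiple sampling periods (cf. NPMK's max_tick_multiple)."""
    max_gap_s = max_gap_multiple / sampling_rate_hz
    gaps = np.diff(timestamps_s)
    return np.flatnonzero(gaps > max_gap_s) + 1

# With the default multiple of 2, a 30 kHz ns6 file is split by any gap
# over ~0.067 ms; with max_gap_multiple=30 it tolerates gaps up to 1 ms.
```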
My suggestion is to improve this in three ways:
- Provide warnings when gaps are found, including their size and type, similar to the Intan gap handling; see https://github.com/NeuralEnsemble/python-neo/pull/1769
- Add a parameter (e.g. `segmentation_threshold` or `segment_threshold_s`) to the constructor so users can control how large a gap must be before a new segment is created (sketched below).
- Give users direct access to the raw timestamps so they can analyze the gaps themselves, perform custom interpolation or drift correction, and align with other acquisition systems.
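To make the last two suggestions concrete, this is roughly what the user-facing side could look like. `segment_threshold_s` and `get_raw_timestamps` are hypothetical names used for illustration, not existing neo API:

```python
import numpy as np
from neo.rawio import BlackrockRawIO

# Suggestion 2 (hypothetical parameter): only split on gaps longer than 1 ms.
reader = BlackrockRawIO(filename="session.ns6", segment_threshold_s=1e-3)
reader.parse_header()

# Suggestion 3 (hypothetical accessor): expose the raw timestamps so users
# can inspect gaps, interpolate, or align with other acquisition systems.
timestamps_s = reader.get_raw_timestamps(stream_index=0)  # not a real method
gaps = np.diff(timestamps_s)
print(f"gaps > 1 ms: {np.sum(gaps > 1e-3)}, largest: {gaps.max():.6f} s")
```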