NWBFile cannot be passed as file argument to NwbRecordingExtractor
Hi, everyone! :)
I am interested in opening an NWB file containing my acquisition traces with SpikeInterface. However, it seems that the read_file_from_backend function called during NwbRecordingExtractor initialization always opens the NWB file with mode="r" (hardcoded), which makes it impossible to edit the NWB file itself. Since NwbRecordingExtractor exposes no mode parameter, I tried to use the file argument instead of file_path, which should accept an "in-memory representation of the NWB file" (according to the docstring). However, when I pass a pynwb.file.NWBFile object read with the pynwb library and correctly opened with mode="r+", the initialization fails, because the NWBFile instance is passed to h5py.File() regardless of the use_pynwb parameter.
I am not sure whether this is the intended behaviour. If it is, I believe it would be important to allow opening NWB files in modes that permit writing. In my case, I read raw acquisitions, process them with SpikeInterface (e.g., LFP band isolation, downsampling, ...), and would like to save the new time series back into the original NWB file. I am currently able to do so, but it involves:
- Saving the processed traces in a temporary file on disk
- Closing and deleting the SpikeInterface recording object:
  recording._nwbfile.get_read_io().close()
  recording._nwbfile.get_read_io()._file.close()
  del recording
- Loading the temporary file with SpikeInterface as a new recording object
- Reopening the same NWB file with pynwb in r+ mode
- Writing the traces stored on disk to the NWB file
- Deleting the temporary file
As you can imagine, the whole process is quite tedious. Thank you in advance for your help and support!
Hi @FrancescoNegri
The hardcoded mode='r' is indeed by design. The extractor should only be able to read, not write traces. I actually have faced the same issue before though...so we might want to be less stringent about it and just warn the user if opening in r+.
This way, one could use spikeinterface to open and do some preprocessing (e.g., extract some LFP and decimate) and then use neuroconv to write the new objects to the same file.
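A minimal sketch of the "warn instead of forbid" idea, assuming a hypothetical helper named check_nwb_open_mode that the extractor could call before opening the file. Neither the helper nor the allowed-mode list is part of SpikeInterface. They are only an illustration of the proposed behaviour:

```python
import warnings

# Hypothetical: the set of modes the extractor would accept.
ALLOWED_MODES = ("r", "r+")


def check_nwb_open_mode(mode: str = "r") -> str:
    """Return the mode unchanged, warning when the file is opened writable."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"Unsupported mode {mode!r}; expected one of {ALLOWED_MODES}")
    if mode == "r+":
        # The extractor itself still only reads traces; the warning just makes
        # the user aware that the underlying file can now be modified.
        warnings.warn(
            "Opening the NWB file in 'r+' mode: the file can be modified "
            "by other code holding the same handle.",
            UserWarning,
            stacklevel=2,
        )
    return mode
```

The default stays "r", so existing read-only usage would be unaffected and only an explicit "r+" triggers the warning.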
@h-mayorquin what do you think?
Yes, that makes sense.
@FrancescoNegri can you describe in more detail how you were trying to save the data, or the workflow that you would like to enable? That is, how do you save the data at the moment? Just a full pseudo-code script of the workflow that you think would work best for your needs.
Current workaround
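import uuid
from pathlib import Path
import pynwb
from pynwb import NWBHDF5IO
import spikeinterface.preprocessing as spre
from spikeinterface.extractors import read_nwb_recording
# SpikeInterfaceRecordingDataChunkIterator comes from neuroconv (import not shown in the original)
# Load raw data from NWB file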
recording = read_nwb_recording(file_path=file_path, electrical_series_path=..., use_pynwb=True)
# Process the data (e.g., filter LFP band)
recording_LFP = spre.filter(recording, **LFP_filter_kwargs)
recording_LFP = spre.common_reference(recording_LFP, ...)
recording_LFP = spre.resample(recording_LFP, resample_rate=1000)
# Save the processed signal in a temp file
temp_save_path = Path("/tmp").joinpath(str(uuid.uuid4())).with_suffix("")
recording_LFP = recording_LFP.save(format="binary", folder=temp_save_path)
# Close the NWB file and delete the original recording object
recording._nwbfile.get_read_io().close()
recording._nwbfile.get_read_io()._file.close()
del recording
# Open the same NWB file in editable mode
io = NWBHDF5IO(file_path, "r+")
nwbfile = io.read()
# Add the processed signal to the NWB file
lfp = pynwb.ecephys.ElectricalSeries(
...,
data=SpikeInterfaceRecordingDataChunkIterator(
recording=recording_LFP,
...
),
...
)
ecephys_module = nwbfile.processing["ecephys"] # if ecephys module exists, otherwise create it
ecephys_module.add(pynwb.ecephys.FilteredEphys(name="LFP", electrical_series=lfp))
nwbfile.get_read_io().write(nwbfile)
# Close the NWB file
nwbfile.get_read_io().close()
nwbfile.get_read_io()._file.close()
del nwbfile
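The temporary-file bookkeeping in the workaround above could be reduced with the standard library. This is only a sketch: with_scratch_folder is a hypothetical helper name, and it assumes that everything needing the saved traces (including the write to the NWB file) happens inside the callback, since the folder is deleted as soon as the callback returns:

```python
import tempfile
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def with_scratch_folder(work: Callable[[Path], T]) -> T:
    """Call work(folder) with a fresh scratch path, then delete everything."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hand out a subfolder that does not exist yet, for savers that
        # refuse to write into an existing folder.
        return work(Path(tmp) / "processed")
```

The callback would contain the recording_LFP.save(...) call and the NWB write from the script above. The cleanup then happens automatically, even if an exception is raised in between.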
Desirable solution 1
# Load raw data from NWB file in editable mode
recording = read_nwb_recording(file_path=file_path, electrical_series_path=..., use_pynwb=True, mode="r+")
# Process the data (e.g., filter LFP band)
recording_LFP = ...
# Add the processed signal to the NWB file
lfp = pynwb.ecephys.ElectricalSeries(
...,
data=SpikeInterfaceRecordingDataChunkIterator(
recording=recording_LFP,
...
),
...
)
ecephys_module = recording._nwbfile.processing["ecephys"] # if ecephys module exists, otherwise create it
ecephys_module.add(pynwb.ecephys.FilteredEphys(name="LFP", electrical_series=lfp))
recording._nwbfile.get_read_io().write(recording._nwbfile)
# Close the NWB file and delete the recording object
recording._nwbfile.get_read_io().close()
recording._nwbfile.get_read_io()._file.close()
del recording
Desirable solution 2
# Open the NWB file in editable mode
io = NWBHDF5IO(file_path, "r+")
nwbfile = io.read()
# Load raw data from NWB file in editable mode (already opened file argument instead of file_path)
recording = read_nwb_recording(file=nwbfile, electrical_series_path=..., use_pynwb=True)
# Process the data (e.g., filter LFP band)
recording_LFP = ...
# Add the processed signal to the NWB file
lfp = pynwb.ecephys.ElectricalSeries(
...,
data=SpikeInterfaceRecordingDataChunkIterator(
recording=recording_LFP,
...
),
...
)
ecephys_module = nwbfile.processing["ecephys"] # if ecephys module exists, otherwise create it
ecephys_module.add(pynwb.ecephys.FilteredEphys(name="LFP", electrical_series=lfp))
nwbfile.get_read_io().write(nwbfile)
# Close the NWB file and delete the recording object
recording._nwbfile.get_read_io().close()
recording._nwbfile.get_read_io()._file.close()
del recording, nwbfile
Regarding the solutions I suggested, it would be crucial to handle the nwbfile instance correctly when the associated recording object is deleted or overwritten, to avoid orphaned open file handles that could lock the file and prevent it from being opened in the right mode, or from being edited at all.
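One possible way to avoid orphaned handles, sketched with stand-in classes only: FakeIO and RecordingLike are hypothetical and do not reflect how NwbRecordingExtractor is implemented. The idea is to register the close with weakref.finalize when the recording opened the file itself, and to leave a caller-provided handle alone:

```python
import weakref


class FakeIO:
    """Stand-in for an open file handle (illustration only)."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class RecordingLike:
    """Stand-in for an extractor that holds an open handle."""

    def __init__(self, io, owns_io=True):
        self._io = io
        # Close only handles we opened ourselves; a handle passed in by the
        # caller (as in solution 2) stays under the caller's control.
        self._finalizer = weakref.finalize(self, io.close) if owns_io else None

    def close(self):
        """Release the handle explicitly; calling it twice is harmless."""
        if self._finalizer is not None:
            self._finalizer()
```

With this pattern, del recording (or overwriting the variable) would release a handle the recording owns once the object is garbage-collected, while in solution 2 the user would remain responsible for closing the NWBHDF5IO they opened.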
Thanks for describing this in detail. My initial reaction is that both workflows should be supported, but that desirable solution 1 might be the low-hanging fruit. Let me think about the implementation and get back to you.
Sure, take your time and thank you for the reply :)