pynapple
Support lazy `TsdFrame`/`TsdTensor`
Currently, this call loads all LFP data into RAM, which is prohibitive for very large datasets:

```python
data = nap.NWBFile(nwb)
data["ElectricalSeriesProbeA-LFP"]
```
Would it be possible to make a lazy TsdFrame (and probably TsdTensor) representation?
This would also speed up and minimize the memory footprint of `compute_perievent_continuous` and other LFP-related processing.
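To illustrate why a lazy representation helps here: a numpy memmap keeps the recording on disk and only materializes the rows you slice, which is exactly the access pattern of perievent windowing. This is a numpy-only sketch (no pynapple involved); the file path and array sizes are made up for the example.

```python
import numpy as np
import os
import tempfile

# Simulate a large on-disk LFP recording (samples x channels) as a memmap,
# so the data never has to fit in RAM all at once.
n_samples, n_channels = 100_000, 64
path = os.path.join(tempfile.mkdtemp(), "lfp.dat")
lfp = np.memmap(path, dtype=np.float32, mode="w+", shape=(n_samples, n_channels))
lfp[:100] = 1.0  # write a little data; the rest stays zero on disk
lfp.flush()

# Re-open read-only: slicing reads only the requested rows from disk.
lazy = np.memmap(path, dtype=np.float32, mode="r", shape=(n_samples, n_channels))
window = np.asarray(lazy[50:60])  # only 10 rows are materialized in RAM
print(window.shape)  # (10, 64)
```

A lazy `TsdFrame` wrapping such an on-disk array would let perievent extraction touch only the samples inside each window.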
Adding comments from #185 here
I was going through TsdTensor and I'm wondering if it supports lazy compute.
Example use case: with calcium imaging data, the outer product of the spatial and temporal components can result in a huge array, hundreds of gigabytes or even terabytes in size. It is almost never necessary to compute the entire array, so a lazy-compute data structure works well. The implementation in mesmerize-core computes the outer product only when the array is sliced: https://github.com/nel-lab/mesmerize-core/blob/master/mesmerize_core/arrays/_cnmf.py#L131-L141
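The idea above can be sketched in a few lines. This is a hypothetical, simplified stand-in for the mesmerize-core implementation (class name and dimensions are invented for illustration): the full product is never built; `__getitem__` slices the temporal factor first and multiplies only the requested frames.

```python
import numpy as np

class LazyOuterProduct:
    """Lazily represents spatial @ temporal without materializing it.

    The full movie (n_pixels x n_frames) never exists in memory;
    indexing by frame computes just the requested columns.
    """

    def __init__(self, spatial, temporal):
        self.spatial = spatial    # (n_pixels, n_components)
        self.temporal = temporal  # (n_components, n_frames)
        self.shape = (spatial.shape[0], temporal.shape[1])

    def __getitem__(self, frames):
        # Slice the small temporal factor first, then take the product.
        return self.spatial @ self.temporal[:, frames]

rng = np.random.default_rng(0)
spatial = rng.random((1000, 20))
temporal = rng.random((20, 5000))
movie = LazyOuterProduct(spatial, temporal)
chunk = movie[0:3]  # shape (1000, 3), computed on demand
```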
As discussed:
So far, feeding a numpy memmap as `d` to `TsdTensor` seems to work. Any object that implements the numpy array API could probably work when passed as `d`. Will test with our LazyArray implementation, as well as other array types (zarr, dask, etc.).
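To make "implements the numpy array API" concrete, here is a hedged, numpy-only sketch of the minimal surface such an object might need (`DiskLikeArray` is hypothetical; whether `shape`, `dtype`, `ndim`, `__getitem__`, and `__array__` are sufficient for `TsdTensor` would need to be verified against pynapple's internals):

```python
import numpy as np

class DiskLikeArray:
    """Hypothetical duck-typed array, standing in for an h5py.Dataset
    or zarr.Array: it exposes shape/dtype/ndim, supports slicing, and
    can be converted to a numpy array on demand."""

    def __init__(self, data):
        self._data = data
        self.shape = data.shape
        self.dtype = data.dtype
        self.ndim = data.ndim

    def __getitem__(self, idx):
        return self._data[idx]

    def __array__(self, dtype=None):
        return np.asarray(self._data, dtype=dtype)

arr = DiskLikeArray(np.arange(12).reshape(3, 2, 2))
total = np.asarray(arr).sum()
```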
Thanks @gviejo
This seems related, but the in-memory object is created by pynapple when accessing the NWB field. So this might need a patch to pass the h5py.Dataset or zarr.Array directly to the TsdFrame.
Fixed by https://github.com/pynapple-org/pynapple/pull/264