BUG: sbx_to_binary computes nframes incorrectly with multiple channels if longer than batch_size frames
Describe the issue:
There seems to be a bug in the way frames are counted when loading one or more Scanbox files. From sbx.py:
for ichunk, onset in enumerate(iblocks[:-1]):
offset = iblocks[ichunk + 1]
im = np.array(f[onset:offset, :, :, ndeadrows:, ndeadcols:]) // 2
im = im.astype(np.int16)
im2mean = im.mean(axis=0).astype(np.float32) / len(iblocks)
for ichan in range(nchannels):
nframes = im.shape[0]
im2write = im[:, :, ichan, :, :]
for j in range(0, nplanes):
if iall == 0:
ops1[j]["meanImg"] = np.zeros((im.shape[3], im.shape[4]),
np.float32)
if nchannels > 1:
ops1[j]["meanImg_chan2"] = np.zeros(
(im.shape[3], im.shape[4]), np.float32)
ops1[j]["nframes"] = 0
if ichan == nfunc:
ops1[j]["meanImg"] += np.squeeze(im2mean[j, ichan, :, :])
reg_file[j].write(
bytearray(im2write[:, j, :, :].astype("int16")))
else:
ops1[j]["meanImg_chan2"] += np.squeeze(im2mean[j, ichan, :, :])
reg_file_chan2[j].write(
bytearray(im2write[:, j, :, :].astype("int16")))
ops1[j]["nframes"] += im2write.shape[0]
ops1[j]["nframes_per_folder"][ifile] += im2write.shape[0]
ik += nframes
iall += nframes
Note that ops1[j]["nframes"] (which becomes ops['nframes']) is updated with the number of frames in the current chunk once per channel, whereas it should be independent of the number of channels. The exception is for the first chunk, where iall is still 0 for each iteration of the ichan loop, so ops1[j]["nframes"] gets reset to 0 for each iteration of this inner loop. Thus, there is only a bug when the total number of frames is greater than the chunk size, which defaults to 500.
Please use the files here to replicate with the code example: https://drive.google.com/drive/folders/1xJ6zG52rMNEKHAN5856KW8PQeeWjxFAB?usp=drive_link
Reproduce the code example:
# pwd should be a writable folder that contains 508_000_000.mat and 508_000_000.sbx
# this is a 1-plane, 2-channel, 1000-frame, 512 x 796 pixel dataset.
import suite2p
import os
ops = suite2p.default_ops()
try:
os.mkdir('suite2p')
except FileExistsError:
pass
ops['data_path'] = ['.']
ops['save_path0'] = '.'
ops['save_folder'] = 'suite2p'
ops['fast_disk'] = 'suite2p'
ops['nplanes'] = 1
ops['nchannels'] = 2
ops['input_format'] = 'sbx'
ops['do_registration'] = False
ops['roidetect'] = False
ops['neuropil_extract'] = False
ops['spikedetect'] = False
ops_out = suite2p.run_s2p(ops=ops, db={})
real_frames = 1000
assert ops_out['nframes'] == real_frames, f'Expected {real_frames} frames, got {ops_out["nframes"]}'
Error message:
Traceback (most recent call last):
File "/synology/eblackwood/VR_learning/2p_testing/508/test_folder/test_bug.py", line 27, in <module>
assert ops_out['nframes'] == real_frames, f'Expected {real_frames} frames, got {ops_out["nframes"]}'
AssertionError: Expected 1000 frames, got 1500
Version information:
suite2p v0.14.4.dev32+gf500c1a.d20240311
Context for the issue:
This doesn't cause an error message when running the pipeline, but it may cause some of the processing steps to give incorrect results (haven't looked into it). It caused an error in a script I wrote for manual cropping when I tried to use nframes from the returned ops to determine how large of a BinaryFile I needed.
Looking a little closer, I think there is at least one other issue with this code block. Initializing "meanImg" to 0 in the if iall == 0 block happens for the 2nd channel as well, meaning that the first 500 frames of the first channel get lost.
I think something like the following would fix it, happy to make a pull request if it makes sense:
for ichunk, onset in enumerate(iblocks[:-1]):
offset = iblocks[ichunk + 1]
im = np.array(f[onset:offset, :, :, ndeadrows:, ndeadcols:]) // 2
im = im.astype(np.int16)
im2mean = im.mean(axis=0).astype(np.float32) / len(iblocks)
for ichan in range(nchannels):
nframes = im.shape[0]
im2write = im[:, :, ichan, :, :]
for j in range(0, nplanes):
if ichan == nfunc:
if iall == 0:
ops1[j]["meanImg"] = np.zeros((im.shape[3], im.shape[4]), np.float32)
ops1[j]["nframes"] = 0
ops1[j]["meanImg"] += np.squeeze(im2mean[j, ichan, :,
ops1[j]["nframes"] += im2write.shape[0]
ops1[j]["nframes_per_folder"][ifile] += im2write.shape[0]
reg_file[j].write(bytearray(im2write[:, j, :, :].astype("int16")))
else:
if iall == 0:
ops1[j]["meanImg_chan2"] = np.zeros(
(im.shape[3], im.shape[4]), np.float32)
ops1[j]["meanImg_chan2"] += np.squeeze(im2mean[j, ichan, :, :])
reg_file_chan2[j].write(
bytearray(im2write[:, j, :, :].astype("int16")))
iall += nframes
I have the same issue.