AllenSDK icon indicating copy to clipboard operation
AllenSDK copied to clipboard

Omission times are logged incorrectly by 1 frame

Open farznaj opened this issue 4 years ago • 13 comments

Describe the bug The time assigned to omissions in the log is off by one frame.

To Reproduce & Expected Behavior Compute flash-omission interval, i.e. the interval between the flash immediately preceding an omission and the omission. You will notice that it is ~734ms instead of 750ms; i.e. 1 frame resolution, ~16ms, shorter than expected.

Then, compute omission-flash interval, i.e. the interval between the omission and the subsequent flash. You will notice that it is 767ms; i.e. 1 frame longer than the expected 750ms interval.

Code to compute flash-omission and omission-flash intervals

# Please get the following two variables for an example session, and then run the code snippet below.

list_flashesOmissions # includes the time of all flashes and omissions in the session
list_omitted # includes the time of all omissions in the session
flash_omit_dur_all = []
omit_flash_dur_all = []

for iomit in range(len(list_omitted)):     

    indiv_time = list_omitted.iloc[iomit]
    
    flashes_relative_timing = list_flashesOmissions - indiv_time # timing of all flashes in the session relative to the current omission
    
    indiv_timef_bef = list_flashesOmissions[np.argwhere(flashes_relative_timing==0).squeeze()-1] # Set the time of the flash preceding the omission 

    indiv_timef_aft = list_flashesOmissions[np.argwhere(flashes_relative_timing==0).squeeze()+1] # Set the time of the flash following the omission 

    flash_omit_dur = indiv_time - indiv_timef_bef # sec # this is to check how much flash-omission interval deviates from the expected .75sec

    omit_flash_dur = indiv_time - indiv_timef_aft # sec # this is to check how much omission-flash interval deviates from the expected .75sec

    flash_omit_dur_all.append(flash_omit_dur)
    omit_flash_dur_all.append(omit_flash_dur)

farznaj avatar Apr 30 '20 16:04 farznaj

Hi @farznaj: You mentioned on a separate slack thread that this may not be an SDK issue directly. Could you expand your code block into a fully self-contained (and minimal) working example (including necessary imports)? Please also include the result you get and the result you expect. Once you're done with that, I can take a look at this and try to better understand what's going on.

dougollerenshaw avatar Apr 30 '20 19:04 dougollerenshaw

Hi @dougollerenshaw . All of this information is already provided in the issue I posted. I have explained what values to expect, what values I get; and I have provided the code that gives those values. And I have mentioned that the code needs two variables ... all of this is provided. I did my best to write something complete.... Can you please read it and let me know what is missing. Currently it is all provided, unless I am missing your point.

farznaj avatar May 01 '20 17:05 farznaj

@farznaj, the code block above doesn't run. If we have a minimally functional example to work from, we can ensure that we're looking at the same data.

dougollerenshaw avatar May 01 '20 18:05 dougollerenshaw

Hi Doug. Do you have the following two variables? list_flashesOmissions # includes the time of all flashes and omissions in the session list_omitted # includes the time of all omissions in the session

Since I dont use AllenSDK, I dont know how to get them. I just know how to get them from VB repo. So the code I sent, as I wrote in the comments above, needs to be provided with these 2 variables.

farznaj avatar May 02 '20 03:05 farznaj

@dougollerenshaw Hi Doug. Please see below for example session_ids to reproduce this issue: example scientifica session_id: 805989030 example mesoscope session_id: 876303107

farznaj avatar May 06 '20 18:05 farznaj

@danielsf We see this same error in our stim tables for the vbn_2022_dev branch. Here's code to reproduce:

from allensdk.brain_observatory.ecephys.behavior_ecephys_session import (
    BehaviorEcephysSession)
from pynwb import NWBHDF5IO
import numpy as np

nwb_path = r"\\allen\programs\mindscope\workgroups\np-behavior\vbn_data_release\nwbs_220429\ecephys_session_1044594870.nwb"
with NWBHDF5IO(nwb_path, 'r', load_namespaces=True) as nwb_io:
        session = BehaviorEcephysSession.from_nwb(nwbfile=nwb_io.read())
    
stimulus_presentations = session.stimulus_presentations

inter_flash_interval = stimulus_presentations.loc[stimulus_presentations['active']].start_time.diff()
active = stimulus_presentations.loc[stimulus_presentations['active']]
print('pre-omission interval: ', inter_flash_interval.values[active['omitted'].astype(bool)].mean())
print('post-omission interval: ', inter_flash_interval.values[np.insert(active['omitted'].astype(bool).values, False, 0)[:-1]].mean())
print('standard interval: ', inter_flash_interval.mean())

pre-omission interval: 0.7342562411382509 post-omission interval: 0.7674120710128001 standard interval: 0.7507912380348117

The intervals during the behavior and passive replay blocks (stimulus blocks 0 and 5) should be 0.75 seconds. Looks like the omission is shifted one frame early, causing the pre-omission interval (time from the flash before the omission to the omission) to be 1 frame short, and the post-omission interval (time from omission to next flash) to be one frame long.

The same is true for the omissions in the passive block (stimulus block 5).

corbennett avatar May 10 '22 18:05 corbennett

(ignore my deleted comment; we'll track this work here)

danielsf avatar May 10 '22 18:05 danielsf

Unfortunately, I think we are going to have to fix this in two places. For better or worse, the stimulus table for ecephys experiments is generated by an independent module in ecephys_etl_pipelines.modules.vbn_create_stimulus_table. The SDK just reads the result of that file in and uses it as the Stimuli data object for our BehaviorEcephysSessions.

To fix this problem for VBN, we will need to touch ecephys_etl_pipelines.

To fix the original problem reported by Farzaneh, we will probably have to touch code in the SDK.

Here is Corbett's script rewritten to demonstrate the bug by just reading in the stimulus csv file

import pathlib
import json
import numpy as np

from allensdk.brain_observatory.behavior.data_files.stimulus_file import BehaviorStimulusFile
from allensdk.brain_observatory.behavior.data_objects.stimuli.presentations \
    import \
    Presentations
from allensdk.brain_observatory.behavior.data_objects.stimuli.templates \
    import \
    Templates
from allensdk.brain_observatory.behavior.data_objects.stimuli.stimuli import \
    Stimuli

session_id = 1044594870
json_dir = pathlib.Path('/allen/programs/mindscope/workgroups/np-behavior/vbn_data_release/input_jsons')

json_path = json_dir / f'BEHAVIOR_ECEPHYS_WRITE_NWB_QUEUE_{session_id}_input.json'
assert json_path.is_file()

with open(json_path, 'rb') as in_file:
    json_data = json.load(in_file)
stim_table_file = json_data['session_data']['stim_table_file']
stimulus_file = json_data['session_data']['behavior_stimulus_file']

print(f'stimulus_table_file: {stim_table_file}')
print(f'behavior_stimulus_file: {stimulus_file}')

behavior_stimulus_file = BehaviorStimulusFile(filepath=stimulus_file)

stimuli = Stimuli(
    presentations=Presentations.from_path(
      path=stim_table_file,
      exclude_columns=None
    ),
    templates=Templates.from_stimulus_file(
        stimulus_file=behavior_stimulus_file)
)

stimulus_presentations = stimuli.presentations.value

inter_flash_interval = stimulus_presentations.loc[stimulus_presentations['active']].start_time.diff()
active = stimulus_presentations.loc[stimulus_presentations['active']]
print('pre-omission interval: ', inter_flash_interval.values[active['omitted'].astype(bool)].mean())
print('post-omission interval: ', inter_flash_interval.values[np.insert(active['omitted'].astype(bool).values, 
False, 0)[:-1]].mean())
print('standard interval: ', inter_flash_interval.mean())

Which, as of 5/11/2022, produces the following output

stimulus_table_file: /allen/programs/braintv/production/visualbehavior/prod0/specimen_1023232540/ecephys_session_1044594870/VbnCreateStimTableStrategy/analysis_run_1159864663/1044594870_VBN_stimulus_table.csv
behavior_stimulus_file: /allen/programs/braintv/production/visualbehavior/prod0/specimen_1023232540/ecephys_session_1044594870/1044791091/1044594870_524761_20200820.behavior.pkl
pre-omission interval:  0.734256241138249
post-omission interval:  0.7674120710128252
standard interval:  0.7507912380348116

I will open a new ticket to perform the fix in ecephys_etl_pipelines and link it to this ticket so that we can propagate what we learn to the SDK.

danielsf avatar May 11 '22 16:05 danielsf

I believe we have tracked down the issue. The discussion is documented here

https://github.com/AllenInstitute/ecephys_etl_pipelines/issues/39

and the fix is here

https://github.com/AllenInstitute/ecephys_etl_pipelines/pull/40

@matchings do you agree with our assessment that that += 1 should be dropped from epoch_start and epoch_end in the visual stimulus processing code.

Also: I don't know if you want to incorporate what we've learned into visual_behavior_analysis or not. I'm just calling that question out here.

danielsf avatar May 19 '22 22:05 danielsf

@matchings ^^ ?

aamster avatar Jul 18 '22 23:07 aamster

(I know I'm not supposed to be working now, but....)

@matchings and I talked about this off line. The decision was to implement this fix as long as it didn't change the results users get from downloading the released NWB files.

This branch

https://github.com/AllenInstitute/AllenSDK/pull/2475

should implement the fix. I verified for myself that it doesn't alter the behavior when loading NWB files, but a triple-check couldn't hurt. I didn't get a chance to actually submit this for review before going on leave.

danielsf avatar Jul 19 '22 00:07 danielsf

@aamster @danielsf I am good with this, but given the importance of timing in these experiments, i'd like to get a second opinion -

@alexpiet @yavorska-iryna @farznaj as key stakeholders in the Visual Behavior dataset, are you ok with changing the SDK output to fix this off by one error in omission start times?? As it is now, the data that users see via NWB files will not be changed. However it is my understanding that if NWB files need to be regenerated for any reason, this fix would be applied and the omission start times would be 1 frame different than before. This seems negligible to me given the frame rate of ophys, but I want to be sure that there are no major concerns before fully signing off on it.

matchings avatar Jul 19 '22 02:07 matchings

I’m ok with that. Thank you.

On Mon, Jul 18, 2022 at 7:07 PM Marina @.***> wrote:

@aamster https://github.com/aamster @danielsf https://github.com/danielsf I am good with this, but given the importance of timing in these experiments, i'd like to get a second opinion

@alexpiet https://github.com/alexpiet @yavorska-iryna https://github.com/yavorska-iryna @farznaj https://github.com/farznaj as key stakeholders in the Visual Behavior dataset, are you ok with changing the SDK output to fix this off by one error in omission start times?? As it is now, the data that users see via NWB files will not be changed. However it is my understanding that if NWB files need to be regenerated for any reason, this fix would be applied and the omission start times would be 1 frame different than before. This seems negligible to me given the frame rate of ophys, but I want to be sure that there are no major concerns before fully signing off on it.

— Reply to this email directly, view it on GitHub https://github.com/AllenInstitute/AllenSDK/issues/1539#issuecomment-1188515163, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADTJ7QUE44GQUYLQLJQ7UA3VUYEVTANCNFSM4MVXUX4A . You are receiving this because you were mentioned.Message ID: @.***>

farznaj avatar Jul 19 '22 02:07 farznaj