
No datatype entity for ieeg when using BIDSLayoutIndexer

Open laemtl opened this issue 3 years ago • 7 comments

bids_layout = BIDSLayout(root=self.bids_dir, config=bids_config, ignore=exclude_arr)
...
files = self.bids_layout.get(subject=subject, session=visit)
for file in files:
  print(file.entities)

{'acquisition': 'seeg', 'datatype': 'ieeg', 'extension': '.tsv', 'session': 'V01', 'subject': 'PID', 'suffix': 'channels', 'task': 'task1'}


bids_layout = BIDSLayout(
  root=self.bids_dir,
  indexer=BIDSLayoutIndexer(config_filename=bids_config, ignore=exclude_arr)
)
...
files = self.bids_layout.get(subject=subject, session=visit)
for file in files:
  print(file.entities)

{'acquisition': 'seeg', 'extension': 'tsv', 'session': 'V01', 'subject': 'PID', 'suffix': 'channels', 'task': 'task1'}

Switching to BIDSLayoutIndexer removes datatype from the list of entities for ieeg recordings.
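To make the difference between the two outputs explicit, the two entity dictionaries printed above can be compared key by key. This is only a sketch on plain dictionaries copied from those outputs; `diff_entities` is a hypothetical helper for illustration and not part of pybids.

```python
def diff_entities(before, after):
    """Return keys dropped, keys added, and keys whose values changed."""
    dropped = sorted(set(before) - set(after))
    added = sorted(set(after) - set(before))
    changed = {
        key: (before[key], after[key])
        for key in sorted(set(before) & set(after))
        if before[key] != after[key]
    }
    return dropped, added, changed


# Entities printed when config/ignore are passed directly to BIDSLayout
direct = {'acquisition': 'seeg', 'datatype': 'ieeg', 'extension': '.tsv',
          'session': 'V01', 'subject': 'PID', 'suffix': 'channels',
          'task': 'task1'}

# Entities printed when the same options go through BIDSLayoutIndexer
via_indexer = {'acquisition': 'seeg', 'extension': 'tsv', 'session': 'V01',
               'subject': 'PID', 'suffix': 'channels', 'task': 'task1'}

dropped, added, changed = diff_entities(direct, via_indexer)
print("dropped:", dropped)
print("added:", added)
print("changed:", changed)
```

Besides the missing datatype, the comparison also shows that the extension loses its leading dot in the second output.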

laemtl avatar Jul 30 '21 04:07 laemtl

I am running into a similar issue with a dataset.

loading data with:

bids_layout = BIDSLayout(
  root=bids_dir,
  config=bids_config,
  ignore=['/code/', '/sourcedata/', '/log/', '.git/'],
  force_index=[re.compile(r"_annotations\.(tsv|json)$")],
  derivatives=True
)

FYI: bids_config points to the following file: https://github.com/aces/Loris-MRI/blob/main/python/lib/bids.json

When I run parse_file_entities on an iEEG file, I get the following:

bids_layout.parse_file_entities("test_annotation/derivatives/sub-PID3/ses-V01/ieeg/sub-PID_ses-V01_task-01_ieeg.edf")
{'subject': 'PID3', 'session': 'V01', 'task': '01', 'suffix': 'ieeg', 'datatype': 'ieeg', 'extension': 'edf'}

However, when I run the following statement, no EDF file is returned:

>>> bids_layout.get(subject='PID3', session='V01', extension=['set', 'edf', 'vhdr', 'vmrk', 'eeg', 'bdf'], datatype='ieeg')
[]

If I remove the datatype restriction from the above command, I do get the EDF file listed:

bids_layout.get(subject='PID3', session='V01', extension=['set', 'edf', 'vhdr', 'vmrk', 'eeg', 'bdf'])
[<BIDSFile filename='/Users/cmadjar/Data/LORIS_imaging_data/incoming/test_annotation/derivatives/sub-PID3/ses-V01/ieeg/sub-PID_ses-V01_task-01_ieeg.edf'>]

It looks like for some reason the datatype restriction of the get command is not working for that file. Any reason why that could be the case?

Thank you!

cmadjar avatar Sep 09 '21 19:09 cmadjar

Okay, I think I see two issues here. The first is that your custom configs don't seem to be respected when it's time to parse entities. I need to look into that some more.

The second, easier one is that, while the datatype pattern in bids.json was updated, the one in bids-nodot.json was not:

https://github.com/bids-standard/pybids/blob/8ee63bda6244d10aabc97841d2055960af89b373/bids/layout/config/bids.json#L89-L91

https://github.com/bids-standard/pybids/blob/8ee63bda6244d10aabc97841d2055960af89b373/bids/layout/config/bids-nodot.json#L88-L91

bids.json is what we're moving to; bids-nodot.json exists to smooth the transition to having the parsed extensions include the initial dot. If you upgrade to 0.14.0rc1, you get this fix for free. Alternatively, we can update the regex in 0.13.x and make a patch release.
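As a rough illustration of why an outdated datatype regex would drop the entity: the sketch below assumes the datatype is extracted by matching a directory name against a fixed alternation, and that the stale pattern simply lacks `ieeg`. Both patterns and the `extract_datatype` helper are made up for this example; the real patterns are in the two config files linked above.

```python
import re

# Illustrative patterns only (assumed shape, not copied from the pybids configs).
PATTERN_WITH_IEEG = r"[/\\]+(anat|dwi|eeg|fmap|func|ieeg|meg)[/\\]+"
PATTERN_WITHOUT_IEEG = r"[/\\]+(anat|dwi|eeg|fmap|func|meg)[/\\]+"


def extract_datatype(path, pattern):
    """Return the first captured directory name, or None if nothing matches."""
    match = re.search(pattern, path)
    return match.group(1) if match else None


path = "sub-PID3/ses-V01/ieeg/sub-PID_ses-V01_task-01_ieeg.edf"

print(extract_datatype(path, PATTERN_WITH_IEEG))
print(extract_datatype(path, PATTERN_WITHOUT_IEEG))
```

With the pattern that lists `ieeg`, the directory name is captured; with the one that does not, nothing matches, so no datatype entity would be attached to the file. Note that `eeg` alone does not match the `ieeg` directory, because the pattern requires a path separator on both sides of the name.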

effigies avatar Sep 16 '21 18:09 effigies

@effigies Thank you for the quick answer!! I just tried version 0.14.0rc1 and it did not resolve the issue. Actually, with version 0.14.0rc1 nothing is returned even when I do not specify the datatype in the query.

Example:

>>> bids_layout.parse_file_entities('test_annotation/derivatives/sub-PID3/ses-V01/ieeg/sub-PID_ses-V01_task-01_ieeg.edf')
{'subject': 'PID3', 'session': 'V01', 'task': '01', 'suffix': 'ieeg', 'datatype': 'ieeg', 'extension': 'edf'}

but cannot find the EDF file in the get query:

>>> bids_layout.get(extension=['set', 'edf', 'vhdr', 'vmrk', 'eeg', 'bdf'])
[]

Could this be due to the first issue you mentioned in your reply? Let me know if there is additional information I can provide to help resolve the issue.

Thank you!

cmadjar avatar Sep 17 '21 13:09 cmadjar

Sorry for dropping this one.

@cmadjar Are you able to share the dataset? It's going to be hard to diagnose, but my guess is that the derivatives just aren't getting indexed. Might be because the derivatives are directly in derivatives/. You could try derivatives=f"{bids_dir}/derivatives" to be explicit about what you want read in.
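One way to check the "derivatives aren't getting indexed" guess without pybids is to look at the directory layout. The sketch below assumes that a derivatives dataset is only picked up when `dataset_description.json` sits either directly in the given directory or in one of its immediate subdirectories (one per pipeline); that assumption, and the helper name `find_derivative_roots`, are mine and should be checked against the pybids version in use.

```python
import os


def find_derivative_roots(derivatives_dir):
    """List directories that look like derivative dataset roots.

    A directory counts as a root if it contains dataset_description.json.
    Only derivatives_dir itself and its immediate subdirectories are checked.
    """
    def has_description(path):
        return os.path.isfile(os.path.join(path, "dataset_description.json"))

    if not os.path.isdir(derivatives_dir):
        return []
    if has_description(derivatives_dir):
        return [derivatives_dir]

    roots = []
    for name in sorted(os.listdir(derivatives_dir)):
        candidate = os.path.join(derivatives_dir, name)
        if os.path.isdir(candidate) and has_description(candidate):
            roots.append(candidate)
    return roots
```

If this returns an empty list for a layout like `derivatives/sub-PID3/ses-V01/ieeg/...`, that would be consistent with the files never being indexed, and with `parse_file_entities` still working since it only parses the path string.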

effigies avatar Oct 19 '21 21:10 effigies

@effigies looks like we could share the dataset we are testing. How can we send it to you?

cmadjar avatar Oct 20 '21 14:10 cmadjar

You can send it to this username @ gmail.com. Any sharing service that lets me fetch it reasonably easily will work.

effigies avatar Oct 20 '21 15:10 effigies

I was able to replicate the problem with the following dataset: https://github.com/bids-standard/bids-examples/tree/master/ieeg_visual

Using pybids==0.13.2: [1]

bids_layout = BIDSLayout( 
  root=self.bids_dir,
  indexer=BIDSLayoutIndexer(
    config_filename=bids_config,
    ignore=exclude_arr,
    force_index=force_arr
  )
)
 
files = bids_layout.get(subject='01', session='01')
for file in files:
  print(file.entities)
{'ECGChannelCount': 0, 'ECOGChannelCount': 118, 'EEGChannelCount': 0, 'EMGChannelCount': 0, 'EOGChannelCount': 0, 'ElectrodeManufacturer': 'AdTech', 'EpochLength': 0, 'HardwareFilters': {'HighpassFilter': {'CutoffFrequency': 0.5}, 'LowpassFilter': {'CutoffFrequency': 300}}, 'InstitutionAddress': '300 Pasteur Dr, Stanford, CA 94305', 'InstitutionName': 'Stanford Hospital and Clinics', 'Instructions': 'look at the dot in the center of the screen and press the button when it changes color', 'Manufacturer': 'Tucker Davis Technologies', 'MiscChannelCount': 0, 'PowerLineFrequency': 60, 'RecordingDuration': 233.639, 'RecordingType': 'continuous', 'SEEGChannelCount': 0, 'SamplingFrequency': 3051.76, 'SoftwareFilters': 'n/a', 'TaskDescription': 'visual gratings and noise patterns', 'TaskName': 'visual', 'TriggerChannelCount': 0, 'extension': 'eeg', 'iEEGPlacementScheme': 'right occipital temporal surface', 'iEEGReference': 'intracranial channel not included with data', 'run': 1, 'session': '01', 'subject': '01', 'suffix': 'ieeg', 'task': 'visual'}

Using pybids==0.12.0: [2]

bids_layout = BIDSLayout(
  root=self.bids_dir, 
  config=bids_config, 
  ignore=exclude_arr, 
  force_index=force_arr
)

files = bids_layout.get(subject='01', session='01')
for file in files:
  print(file.entities)
{'ECGChannelCount': 0, 'ECOGChannelCount': 118, 'EEGChannelCount': 0, 'EMGChannelCount': 0, 'EOGChannelCount': 0, 'ElectrodeManufacturer': 'AdTech', 'EpochLength': 0, 'HardwareFilters': {'HighpassFilter': {'CutoffFrequency': 0.5}, 'LowpassFilter': {'CutoffFrequency': 300}}, 'InstitutionAddress': '300 Pasteur Dr, Stanford, CA 94305', 'InstitutionName': 'Stanford Hospital and Clinics', 'Instructions': 'look at the dot in the center of the screen and press the button when it changes color', 'Manufacturer': 'Tucker Davis Technologies', 'MiscChannelCount': 0, 'PowerLineFrequency': 60, 'RecordingDuration': 233.639, 'RecordingType': 'continuous', 'SEEGChannelCount': 0, 'SamplingFrequency': 3051.76, 'SoftwareFilters': 'n/a', 'TaskDescription': 'visual gratings and noise patterns', 'TaskName': 'visual', 'TriggerChannelCount': 0, 'datatype': 'ieeg', 'extension': 'eeg', 'iEEGPlacementScheme': 'right occipital temporal surface', 'iEEGReference': 'intracranial channel not included with data', 'run': 1, 'session': '01', 'subject': '01', 'suffix': 'ieeg', 'task': 'visual'}

With pybids==0.12.0 and code sample [2], the datatype entity is available for the ieeg files, but it is missing with pybids>=0.12.1 and code sample [1].

laemtl avatar Oct 20 '21 16:10 laemtl