mne-bids
BIDS anonymize dataset unwanted behavior for split.
Describe the problem
As a disclaimer, note that I am using mne_bids version 0.10 and BIDS version 1.6.0, so this issue might have been solved in a later version, even though I haven't seen anyone report it before.
I am currently anonymizing a BIDS dataset and I found this very nice function to be extremely handy. https://mne.tools/mne-bids/stable/generated/mne_bids.anonymize_dataset.html
Unfortunately, it does not handle split files correctly. When a .fif file needs to be split (above 2 GB), MNE splits it automatically, which is great. The problem occurs when the dataset already contains split files: the function then creates a split again.
For example, these were my original filenames:
sub-001_ses-PeriOp_task-HoldR_acq-MedOff_run-1_split-01_meg.fif
sub-001_ses-PeriOp_task-HoldR_acq-MedOff_run-1_split-02_meg.fif
After running the anonymization:
sub-2IhVOz_ses-PeriOp_task-HoldR_acq-MedOff_run-1_split-01-split-01_meg.fif
sub-2IhVOz_ses-PeriOp_task-HoldR_acq-MedOff_run-1_split-01-split-02_meg.fif
sub-2IhVOz_ses-PeriOp_task-HoldR_acq-MedOff_run-1_split-02_meg.fif (which is just a copy of the actual split)
I tried to solve this manually by renaming the files and removing the extra split that was created. I then also corrected the error that had spread to scans.tsv. In the end, I had a dataset that passed the BIDS validator. Unfortunately, renaming the files like this created a bigger problem, because split files should be renamed by loading and re-saving them with MNE-Python to preserve proper filename linkage.
I assume this is not the intended behavior of this function; we probably want the following output:
sub-2IhVOz_ses-PeriOp_task-HoldR_acq-MedOff_run-1_split-01_meg.fif
sub-2IhVOz_ses-PeriOp_task-HoldR_acq-MedOff_run-1_split-02_meg.fif
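To spot affected files in an anonymized dataset, a small filename check can help. This is only a sketch: the regular expression is an assumption derived from the filenames shown above, and the helper names `expected_name` and `find_malformed` are hypothetical, not part of MNE-BIDS. It only reports names; it does not rename anything on disk, since renaming split files by hand breaks their linkage.

```python
import re

# Matches a "split" entity that was written twice, e.g. "_split-01-split-02_".
# Assumption: malformed names always follow the pattern seen in this report.
_DUP_SPLIT = re.compile(r"_split-\d+-split-(\d+)_")


def expected_name(anonymized_name: str) -> str:
    """Collapse a doubled split entity into a single one, keeping the last index."""
    return _DUP_SPLIT.sub(r"_split-\1_", anonymized_name)


def find_malformed(names):
    """Return the filenames that carry a doubled split entity."""
    return [n for n in names if _DUP_SPLIT.search(n)]
```

For example, `find_malformed` could be run over the basenames in the anonymized `meg` folder to list the recordings that need to be re-saved.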
Describe your solution
I haven't dug much into the function itself, but I guess the main idea to solve it would be to handle split files like this:
from mne_bids import BIDSPath, read_raw_bids

# get all the .fif files of the subject currently being anonymized
meg_files = BIDSPath(root=root, subject=subject, extension='.fif').match()
# get the first and second split of each recording
split1 = [f for f in meg_files if 'split-01' in f.basename]
split2 = [f for f in meg_files if 'split-02' in f.basename]
if len(split1) > 0 and len(split2) > 0:
    for s1, s2 in zip(split1, split2):
        raw = read_raw_bids(s1)  # read the raw file that we will save
        ...
I think this bug occurs because the BIDS basename still contains the split entity (split-01), which is why an additional split-01 ends up in the basename when the anonymized copy is created.
So one idea that comes to mind is to update the filename to drop the split entity, simply by doing:
s1_basename_anon = s1.copy().update(split=None)
Then, instead of also saving the second split, we could just save raw to s1_basename_anon, which will create the new splits immediately with the correct linkage:
raw.save(s1_basename_anon.fpath, split_naming='bids')
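Put together, the re-saving step could look roughly like the sketch below. It is not the MNE-BIDS implementation: `unsplit_basename` and `resave_split_recording` are hypothetical helpers, the regular expression assumes the split entity is followed by another underscore-separated part, and the sketch covers only the re-saving, not the change of subject label or the anonymization of the measurement info.

```python
import re
from pathlib import Path


def unsplit_basename(basename: str) -> str:
    """Drop the split entity: '..._run-1_split-01_meg.fif' -> '..._run-1_meg.fif'."""
    return re.sub(r"_split-\d+(?=_)", "", basename)


def resave_split_recording(first_split, out_dir) -> Path:
    """Read a split recording via its first part and write it out again.

    Hypothetical helper, not part of MNE-BIDS.
    """
    import mne  # imported lazily so the filename helper works without MNE

    # Reading the first part follows the stored links to the later parts.
    raw = mne.io.read_raw_fif(first_split)
    out_path = Path(out_dir) / unsplit_basename(Path(first_split).name)
    # With split_naming='bids', MNE adds the split entity itself whenever the
    # data exceed the split size, so the target name must not contain one.
    raw.save(out_path, split_naming="bids", overwrite=True)
    return out_path
```

Passing a target name without the split entity is the point of the design: MNE then generates the split-01, split-02, ... files itself instead of appending a second split entity to an existing one.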
Describe possible alternatives
Implementing or fixing this (if it has not been fixed already) would be really helpful and would save time for anyone anonymizing BIDS datasets in the future. Thank you.
Additional context
No response
Could you please try this with the most recent stable version of mne_bids (0.14) and see if it works?
I tried again using the most recent version of mne_bids 0.14 after upgrading using
pip install --upgrade mne-bids
but the unwanted behavior remains the same as I described in my previous message.
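To confirm which version is actually in use after an upgrade, the installed version can be queried from Python. This is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print("mne-bids:", installed_version("mne-bids"))
```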
Here is an image of my original folder:
and the anonymized output:
We recently fixed a similar issue in MNE-BIDS-Pipeline
https://github.com/mne-tools/mne-bids-pipeline/pull/855