bids-tools
CTF's <blabla>.infods file contains information about the date and time of data acquisition
I came across this while removing the date and time stamps from the *.res4 files of data that Kristijan and I are going to share with the rest of the world. I was surprised that the output of ft_read_header still contained the date of acquisition. hdr.orig.res4 is nicely stripped, of course, after running ctf_remove_datetime, but hdr.orig.infods still has this info available.
To reproduce:
cd /project/3011020.13/bids/sub-V1020/meg/sub-V1020_task-visual_meg.ds
x = ft_read_header('sub-V1020_task-visual_meg.ds');
x.orig.infods(41)
Alternatively, you can open the 'sub-V1020_task-visual_meg.infods' file as text...
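For a quick check that does not depend on FieldTrip, the raw bytes of a file can be scanned for strings that look like time stamps. The following is only a rough sketch: it assumes that date-times are stored as plain ASCII in the form YYYYMMDDhhmmss (14 digits), which may not hold for every field or file type, and `find_timestamps` is a hypothetical helper name, not an existing FieldTrip or CTF function.

```matlab
function stamps = find_timestamps(filename)
% FIND_TIMESTAMPS  hypothetical helper, not part of FieldTrip or the CTF code.
% Returns every isolated run of exactly 14 digits found in the raw bytes of
% FILENAME, assuming that date-times are stored as ASCII YYYYMMDDhhmmss.

fid = fopen(filename, 'r');
if fid < 0
  error('could not open %s', filename);
end
raw = fread(fid, inf, 'uint8=>char')';  % all bytes as a char row vector
fclose(fid);

% blank out non-printable bytes so that the regular expression only sees text
raw(raw < 32 | raw > 126) = ' ';

% match 14 digits that are not part of a longer run of digits
stamps = regexp(raw, '(?<!\d)\d{14}(?!\d)', 'match');
end
```

Calling `find_timestamps` on the *.infods, *.acq and *.res4 files of a dataset returns a cell-array of candidate strings. An empty result does not prove that the file is clean, since dates stored in another format would not be detected.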
I would assume that the .infods files are usually not removed from the data before sharing, so I would be game to write some voodoo shell script that can butcher the .infods file.
Should we continue with these scripts, or should we use the approach at https://www.fieldtriptoolbox.org/faq/how_can_i_anonymize_a_ctf_dataset/#using-matlab instead?
I recently used the latter for a Parkinson MEG dataset.
It is this dataset: https://data.donders.ru.nl/collections/di/dccn/DSC_3018009.04_857
I checked one infods file, and it looked like this:
WS1_
_PATIENT_INFO WS1_ _PATIENT_UID
_PATIENT_NAME_FIRST
_PATIENT_NAME_MIDDLE
_PATIENT_NAME_LAST
_PATIENT_ID
NOT FOR CLINICAL USE _PATIENT_BIRTHDATE
_PATIENT_SEX _PATIENT_PACS_NAME
_PATIENT_PACS_UID
_PATIENT_INSTITUTE
NOT FOR CLINICAL USE EndOfParameters _PROCEDURE_INFO WS1_ _PROCEDURE_VERSION _PROCEDURE_UID
_PROCEDURE_ACCESSIONNUMBER
_PROCEDURE_TITLE
_PROCEDURE_SITE
_PROCEDURE_STATUS _PROCEDURE_TYPE _PROCEDURE_STARTEDDATETIME
_PROCEDURE_CLOSEDDATETIME
_PROCEDURE_COMMENTS
writeCTFds NOT FOR CLINICAL USE _PROCEDURE_LOCATION
_PROCEDURE_ISINDB EndOfParameters
_DATASET_INFO WS1_ _DATASET_VERSION _DATASET_UID
_DATASET_PATIENTUID
_DATASET_PROCEDUREUID
_DATASET_STATUS
writeCTFds NOT FOR CLINICAL USE _DATASET_RPFILE
_DATASET_PROCSTEPTITLE
run title NOT FOR CLINICAL USE _DATASET_PROCSTEPPROTOCOL
_DATASET_PROCSTEPDESCRIPTION
_DATASET_COLLECTIONDATETIME
_DATASET_COLLECTIONSOFTWARE
writeCTFds _DATASET_CREATORDATETIME
20210409140701 _DATASET_CREATORSOFTWARE
writeCTFds _DATASET_KEYWORDS
_DATASET_COMMENTS
NOT FOR CLINICAL USE _DATASET_OPERATORNAME
_DATASET_LASTMODIFIEDDATETIME
20210409140701 _DATASET_NOMINALHCPOSITIONS _DATASET_COEFSFILENAME
_DATASET_SENSORSFILENAME
_DATASET_SYSTEM
_DATASET_SYSTEMTYPE
_DATASET_LOWERBANDWIDTH _DATASET_UPPERBANDWIDTH @r¿
I think that using the referenced strategy is better. Yet, for my current use case it feels like overkill, because it requires a full copy of the data to be created (i.e. no in-place update of the descriptors seems possible unless the code is hacked).
Also, from what I read in writeCTFds.m, it seems that the hz.ds/hz2.ds are not included in the output (although I am not sure whether this would be a problem).
Also, it seems that the *.acq files may also contain run_date and run_time.
writeCTFds uses writeCPersist to write the acq and infods files. This is a separate function (i.e. not a subfunction of writeCTFds), so it should be possible to overwrite these metadata files without rewriting the whole data directory.
OK, I have written a prototype function (inspired by the function that @robertoostenveld referred to above) that rewrites the files in the *.ds dir that contain dates and times, i.e. the res4, acq and infods files. It does this without creating a full copy of the binary data. Currently, my prototype function moves the originals to *.res4_old etc., but once we are sure that it works fine, I think that the originals can be overwritten. Would it be an idea to use this code to refresh the referenced website, and/or to consider making this part of the standard bidsification procedure in data2bids?
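The prototype itself is not shown in this thread, but the "move the originals out of the way" step it describes could look roughly like the sketch below. The function name `backup_ds_metadata` is hypothetical, and the step that actually rewrites the scrubbed files (reading from the `*_old` copy and writing to the original name) is deliberately left out.

```matlab
function moved = backup_ds_metadata(dsdir)
% BACKUP_DS_METADATA  hypothetical helper, not the prototype mentioned above.
% Renames the *.res4, *.acq and *.infods files in a CTF .ds directory to
% *_old and returns a struct-array with the original and backup file names.
% The binary data files in the directory are not touched.

exts  = {'res4', 'acq', 'infods'};
moved = struct('original', {}, 'backup', {});

for i = 1:numel(exts)
  list = dir(fullfile(dsdir, ['*.' exts{i}]));
  for j = 1:numel(list)
    if list(j).isdir
      continue  % only plain files are of interest
    end
    original = fullfile(dsdir, list(j).name);
    backup   = [original '_old'];
    if exist(backup, 'file')
      error('%s already exists, refusing to overwrite it', backup);
    end
    movefile(original, backup);
    moved(end+1) = struct('original', original, 'backup', backup); %#ok<AGROW>
  end
end
end
```

Each element of the returned struct-array gives the pair of file names that a rewriting step would read from (`backup`) and write to (`original`). Refusing to overwrite an existing `*_old` file protects the true originals when the function is run twice on the same dataset.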
@KristijanArmeni I will soon do a full sweep of the Sherlock data to scrub it of date and time (and operator :) )