[ENH] add _proc-<label> to all modalities having _rec (anat, fmap, func, perf, and pet)

Open yarikoptic opened this issue 5 years ago • 44 comments

Initially I also thought of introducing it for behavioral data, since IMHO it could apply whenever hardware already produces multiple files, e.g. where physio signals were minimally preprocessed (removing MR pulse effects, etc.). If anyone already has a use case, let me know or just push an additional commit for that portion.

Closes: #65

yarikoptic avatar Dec 12 '18 18:12 yarikoptic

I personally feel uneasy adding this new keyword to any data type it could possibly apply to when there is a use case only for structural scans. Without concrete use case for other data types it is hard to evaluate all angles and thus there is a risk of adding something that is suboptimal but will need to be kept for backward compatibility.

I would recommend being conservative and sticking only to enhancements driven by real-world use cases.

chrisgorgo avatar Dec 12 '18 20:12 chrisgorgo

> I personally feel uneasy adding this new keyword to any data type it could possibly apply to when there is a use case only for structural scans. Without concrete use case for other data types it is hard to evaluate all angles and thus there is a risk of adding something that is suboptimal but will need to be kept for backward compatibility.
>
> I would recommend being conservative and sticking only to enhancements driven by real-world use cases.

Here are some details about the use case which brought me to raise this issue/PR:

718211$ for d in *; do f=$(/bin/ls $d/| head -n 1); echo -n "$d  "; dcmdump $d/$f | grep 0008,0008; done | grep NORM
anat-scout_acq-64ch_ses-01_512x512.1  (0008,0008) CS [ORIGINAL\PRIMARY\M\ND\NORM]             #  26, 5 ImageType
anat-scout_acq-64ch_ses-01_512x512.17  (0008,0008) CS [ORIGINAL\PRIMARY\M\ND\NORM]             #  26, 5 ImageType
anat-T1w_ses-01_320x300.9  (0008,0008) CS [ORIGINAL\PRIMARY\M\ND\NORM]             #  26, 5 ImageType
anat-T2w_ses-01_320x300.11  (0008,0008) CS [ORIGINAL\PRIMARY\M\ND\NORM]             #  26, 5 ImageType
func_task-encodingFaces_acq-MB8_run-01_ses-01_936x936.2  (0008,0008) CS [ORIGINAL\PRIMARY\M\ND\NORM\MOSAIC]      #  34, 6 ImageType
func_task-encodingFaces_acq-MB8_run-02_ses-01_936x936.3  (0008,0008) CS [ORIGINAL\PRIMARY\M\ND\NORM\MOSAIC]      #  34, 6 ImageType
func_task-encodingFood_acq-MB8_run-01_ses-01_936x936.4  (0008,0008) CS [ORIGINAL\PRIMARY\M\ND\NORM\MOSAIC]      #  34, 6 ImageType
func_task-encodingFood_acq-MB8_run-02_ses-01_936x936.5  (0008,0008) CS [ORIGINAL\PRIMARY\M\ND\NORM\MOSAIC]      #  34, 6 ImageType
func_task-nbackFaces_acq-MB8_run-01_ses-01_936x936.18  (0008,0008) CS [ORIGINAL\PRIMARY\M\ND\NORM\MOSAIC]      #  34, 6 ImageType
func_task-nbackFaces_acq-MB8_run-02_ses-01_936x936.19  (0008,0008) CS [ORIGINAL\PRIMARY\M\ND\NORM\MOSAIC]      #  34, 6 ImageType
func_task-recognitionFaces_acq-MB8_run-01_ses-01_936x936.12  (0008,0008) CS [ORIGINAL\PRIMARY\M\ND\NORM\MOSAIC]      #  34, 6 ImageType
func_task-recognitionFaces_acq-MB8_run-02_ses-01_936x936.13  (0008,0008) CS [ORIGINAL\PRIMARY\M\ND\NORM\MOSAIC]      #  34, 6 ImageType
func_task-recognitionFood_acq-MB8_run-01_ses-01_936x936.14  (0008,0008) CS [ORIGINAL\PRIMARY\M\ND\NORM\MOSAIC]      #  34, 6 ImageType
func_task-recognitionFood_acq-MB8_run-02_ses-01_936x936.15  (0008,0008) CS [ORIGINAL\PRIMARY\M\ND\NORM\MOSAIC]      #  34, 6 ImageType
func_task-rest_acq-MB8_run-01_ses-01_936x936.16  (0008,0008) CS [ORIGINAL\PRIMARY\M\ND\NORM\MOSAIC]      #  34, 6 ImageType

So the NORM flag was set not only for anatomical but also for functional images. Even though the "very" original image was not output, the one we obtained did go through a normalization procedure in the scanner. BTW, I think I forgot to point to the explanation and pointers Chris Rorden (@neurolabusc) provided: https://github.com/nipy/heudiconv/issues/266#issuecomment-432662723 .
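A minimal Python sketch of the same check, assuming the dcmdump output format shown above (one line for tag 0008,0008 with the value in square brackets). The function names are made up for illustration and are not part of dcmdump or heudiconv.

```python
from __future__ import annotations


def image_type_values(dcmdump_line: str) -> list[str]:
    """Split the bracketed ImageType value of a dcmdump line into its components."""
    start = dcmdump_line.index("[") + 1
    end = dcmdump_line.index("]", start)
    # DICOM multi-valued strings are separated by backslashes
    return dcmdump_line[start:end].split("\\")


def has_norm_flag(dcmdump_line: str) -> bool:
    """True if 'NORM' is one of the ImageType components."""
    return "NORM" in image_type_values(dcmdump_line)


# Sample line copied from the listing above
line = r"(0008,0008) CS [ORIGINAL\PRIMARY\M\ND\NORM\MOSAIC]      #  34, 6 ImageType"
print(image_type_values(line))
print(has_norm_flag(line))
```

Matching on whole components rather than on a substring avoids false hits if some other value ever happens to contain the letters NORM.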

I will see if there is some time on our scanner to check whether the scanning parameters for EPI and DWI allow output of both "original" and some preprocessed ("normalized") data. @kodiweera - maybe you know off the top of your head?

yarikoptic avatar Dec 13 '18 03:12 yarikoptic

@chrisfilo wrote (apparently can't quote directly from the review comment :-/):

> Please provide more language on how proc differs from rec. It would also be useful to revisit the existing examples that recommend using rec for differentiating on scanner motion corrected scans from others.

Good point! Besides the duplication in describing _rec- (with motion correction as an example) twice in the subsequent paragraphs (blame me for the changes in 79e03e16a209512f4548b20b078379403e9dbfd1, I guess I wanted to compress the wording but forgot to remove the original one for _rec), we should indeed review it. So here is my take:

To me, _rec- is about the "reconstruction" algorithm (k-space -> "physical"? space), which could potentially differ. If in-scanner motion correction is somehow done during such a transformation (big if), it is valid information to place into the _rec field. But otherwise I would expect _rec to contain information about the actual reconstruction applied and not scanner post-processing. So why bother having _rec at all? I do know some labs (e.g. Wandell) which I believe collect k-space data and then do the reconstruction "themselves" (do you ever run into Brian to ask?). With all the "sparse sampling" and multiband, the reconstruction question becomes really non-trivial: there can be multiple ways to arrive at the final data, and thus the _rec- field to discriminate between them. That is why it should be useful for hardcore MR ppl and physicists and should stay in the specification. Among the openneuro datasets which we have as public datalad datasets on http://datasets.datalad.org/?dir=/openneuro I see only one which used _rec-, with the label halfhalf, and it is our dataset ;) @snastase -- was "halfhalf" meant to depict spatial smoothing because at that time there was no other, more appropriate field like _proc in BIDS?

"Motion correction" IMHO belongs more appropriately in _proc if it is done "post reconstruction", and the fact that we listed it under _rec could well be a mistake/shortcoming (unless it is truly related to reconstruction). @kodiweera, @neurolabusc - do you know how motion correction is typically done by the scanner?
There are all those predictive "motion prediction" FOV adjustments I have heard about, but I am not sure if any is already implemented in the scanner. And anyway, it might not need to be depicted in the name, since it would just be a feature of the "original" data and would not be related to reconstruction per se.

So maybe "motion correction" should actually be depicted in _proc, which I would be happy to adjust in this PR.

yarikoptic avatar Dec 13 '18 03:12 yarikoptic

Regarding scope - I see the argument for anat and func, but not necessarily for fieldmaps. I would recommend holding off on adding the new keyword for those data types until we have a use case.

chrisgorgo avatar Dec 13 '18 19:12 chrisgorgo

> Regarding scope - I see the argument for anat and func, but not necessarily for fieldmaps. I would recommend holding off on adding the new keyword for those data types until we have a use case.

How are fieldmaps special, that they avoid the doom of being preprocessed by the scanner? ;)

(git-annex)hopa:~/datalad/openneuro[master]git-annex
$> datalad -f '{path} {metadata[bids][ImageType]}' -c datalad.search.index-egrep-documenttype=files search --mode egrep path:.*fmap/.*.nii.gz bids.ImageType:NORM | sed -e "s,$PWD,,g"     
/ds001399/sub-MOCCAG001/ses-02/fmap/sub-MOCCAG001_ses-02_acq-multiband3p0MB2AP_epi.nii.gz ['ORIGINAL', 'PRIMARY', 'M', 'ND', 'NORM', 'MOSAIC']
/ds001399/sub-MOCCAG003/ses-01/fmap/sub-MOCCAG003_ses-01_acq-multiband2p4flatAP_epi.nii.gz ['ORIGINAL', 'PRIMARY', 'M', 'ND', 'NORM', 'MOSAIC']
/ds001399/sub-MOCCAG003/ses-01/fmap/sub-MOCCAG003_ses-01_acq-multiband3p0MB2PA_epi.nii.gz ['ORIGINAL', 'PRIMARY', 'M', 'ND', 'NORM', 'MOSAIC']
/ds001399/sub-MOCCAG003/ses-01/fmap/sub-MOCCAG003_ses-01_acq-multiband3p0MB2AP_epi.nii.gz ['ORIGINAL', 'PRIMARY', 'M', 'ND', 'NORM', 'MOSAIC']
/ds001399/sub-MOCCAG003/ses-01/fmap/sub-MOCCAG003_ses-01_acq-multiband2p4flatPA_epi.nii.gz ['ORIGINAL', 'PRIMARY', 'M', 'ND', 'NORM', 'MOSAIC']
/ds001399/sub-MOCCAG001/ses-02/fmap/sub-MOCCAG001_ses-02_acq-multiband3p0MB2PA_epi.nii.gz ['ORIGINAL', 'PRIMARY', 'M', 'ND', 'NORM', 'MOSAIC']

and I also found 'NORM' listed in the _magnitude*.json files of locally acquired regular fieldmaps, and in some spin echoes used for topup fieldmaps

yarikoptic avatar Dec 13 '18 22:12 yarikoptic

I see the snark is back. I would appreciate if someone else could take over handling this PR.

chrisgorgo avatar Dec 13 '18 23:12 chrisgorgo

just curious, what snark is back? ;-)

From The Free On-line Dictionary of Computing (05 January 2017) [foldoc]:

  snark
  
     [Lewis Carroll, via the Michigan Terminal System] 1. A system
     failure.  When a user's process bombed, the operator would get
     the message "Help, Help, Snark in MTS!"
  
     2. More generally, any kind of unexplained or threatening
     event on a computer (especially if it might be a boojum).
     Often used to refer to an event or a log file entry that might
     indicate an attempted security violation.  See {snivitz}.
  
     3. UUCP name of snark.thyrsus.com, home site of the Hacker
     {Jargon File} versions 2.*.*.
  
     [{Jargon File}]
  

@chrisfilo I don't want to be pushy. If you insist that it is better to leave fieldmaps out of it, I could do that and will not resist any longer. I personally just don't see a reason for that, only to come back to it later when someone asks or starts (silently) sticking those values into other fields. People are masters at coming up with workarounds ;-)

yarikoptic avatar Dec 14 '18 04:12 yarikoptic

_rec vs _proc comes up also for SWI in BEP004, see https://docs.google.com/document/d/1kyw9mGgacNqeMbp4xZet3RnDhcMmf4_BmRgKaOkO2Sc/edit?disco=AAAAD8DstRE

yarikoptic avatar Nov 13 '19 16:11 yarikoptic

And now I am running into the desire to use _proc for some DWI (@dnkennedy use case):

$> for d in 501_26/1[345678]*; do dcmdump $d/*-000001-*.ima | grep SeriesDescrip; done
(0008,103e) LO [ep2d_diff_high35]                       #  16, 1 SeriesDescription
(0008,103e) LO [ep2d_diff_high35_ADC]                   #  20, 1 SeriesDescription
(0008,103e) LO [ep2d_diff_high35_TRACEW]                #  24, 1 SeriesDescription
(0008,103e) LO [ep2d_diff_high35_FA]                    #  20, 1 SeriesDescription
(0008,103e) LO [ep2d_diff_high35_ColFA]                 #  22, 1 SeriesDescription
(0008,103e) LO [ep2d_diff_high35_TENSOR]                #  24, 1 SeriesDescription

And AFAIK those are all derived data/estimates as processed by the scanner software. In BIDS for DWI ATM we do not have any field where they should go: the freeform _acq- is there, but it is really not meant to describe such transformations of the "raw" data reconstructed from k-space. I also feel that _rec- should really be reserved for annotating the "reconstruction" of data, as opposed to post-processing done on the device after reconstruction (thus the language should actually be adjusted).

FWIW I do not see many uses of _rec among openneuro datasets, and those which are present are "debatable":
(git-annex)lena:~/datalad/openneuro[master]git
$> datalad -f '{path} {metadata[bids][ImageType]}' -c datalad.search.index-egrep-documenttype=files search --mode egrep 'path:.*/.*_rec-.*\.nii.gz'
/home/yoh/datalad/openneuro/ds000233/sub-rid000001/anat/sub-rid000001_rec-ehalfhalf_T1w.nii.gz N/A
/home/yoh/datalad/openneuro/ds000233/sub-rid000001/anat/sub-rid000001_rec-ehalfhalf_mod-T1w_defacemask.nii.gz N/A
...
/home/yoh/datalad/openneuro/ds001705/sub-000101/ses-baseline/pet/sub-000101_ses-baseline_rec-MLEM_pet.nii.gz N/A
/home/yoh/datalad/openneuro/ds001705/sub-000101/ses-displaced/pet/sub-000101_ses-displaced_rec-MLEM_pet.nii.gz N/A
...
/home/yoh/datalad/openneuro/ds001740/derivatives/sub-pilote2/me_func/sub-pilote2_task-convers_run-03_rec-arithmsum_bold.nii.gz ['ORIGINAL', 'PRIMARY', 'M', 'MB', 'TE1', 'ND', 'MOSAIC']
/home/yoh/datalad/openneuro/ds001740/derivatives/sub-pilote2/me_func/sub-pilote2_task-convers_run-03_rec-t2sweighted_bold.nii.gz ['ORIGINAL', 'PRIMARY', 'M', 'MB', 'TE1', 'ND', 'MOSAIC']
...
/home/yoh/datalad/openneuro/ds002041/sub-2001/pet/sub-2001_task-rest_acq-fallypride_rec-acdyn_pet_bpnd.nii.gz N/A
/home/yoh/datalad/openneuro/ds002041/sub-2001/pet/sub-2001_task-rest_acq-fallypride_rec-acdyn_pet_bpnd_space-MNI152.nii.gz N/A
...
/home/yoh/datalad/openneuro/ds002156/sub-23638/ses-04/anat/sub-23638_ses-04_acq-mprage_rec-ORIG_run-1_T1w.nii.gz ['ORIGINAL', 'PRIMARY', 'OTHER']
/home/yoh/datalad/openneuro/ds002276/sub-PILOT/ses-01/func/sub-PILOT_ses-01_task-StrangerThingsS01E01_rec-magnitude_run-01_echo-1_sbref.nii.gz ['ORIGINAL', 'PRIMARY', 'M', 'TE1', 'ND', 'MOSAIC']
/home/yoh/datalad/openneuro/ds002276/sub-PILOT/ses-01/func/sub-PILOT_ses-01_task-StrangerThingsS01E01_rec-phase_run-01_echo-1_sbref.nii.gz ['ORIGINAL', 'PRIMARY', 'P', 'TE1', 'ND', 'MOSAIC']
...

Dear @bids-standard/bep_leads (since I bet you have similar use cases for other modalities), @bids-standard/derivatives (since it might relate to annotating derived data), @bids-standard/steering -- please provide feedback. Also @francopestilli , since you worked a lot with DWI, what do you think/how do you annotate "derived on the scanner" func/, dwi/ and other volumes? Should we keep "abusing" _rec and _acq or use more descriptive/generic _proc?

yarikoptic avatar Feb 17 '20 00:02 yarikoptic

Thinking about it - the aforementioned processed dwi files probably shouldn't be a mere _proc or _rec -- they are really derived data, not mere "slightly preprocessed original data", so they probably shouldn't even have the _dwi suffix, but rather get dedicated ones (_fa, etc). The decision should be "aligned" with the "common derivatives" approach in https://github.com/bids-standard/bids-specification/pull/265
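A minimal sketch of how a converter could tell the scanner-derived DWI series apart from the acquired one, assuming the SeriesDescription naming seen in the dcmdump listing above. Series descriptions are site-specific, so the suffix list is only what appears in this thread, and the helper name `split_derived` is hypothetical.

```python
from __future__ import annotations

# Trailing tokens observed in the SeriesDescription listing above;
# other scanners or protocols may well use different ones (assumption).
DERIVED_SUFFIXES = ("ADC", "TRACEW", "FA", "ColFA", "TENSOR")


def split_derived(series_description: str) -> tuple[str, str | None]:
    """Return (base series name, derived-map kind or None)."""
    base, sep, tail = series_description.rpartition("_")
    if sep and tail in DERIVED_SUFFIXES:
        return base, tail
    return series_description, None


for desc in ("ep2d_diff_high35", "ep2d_diff_high35_FA", "ep2d_diff_high35_ColFA"):
    print(desc, "->", split_derived(desc))
```

A converter could then route the series with a non-empty second element to dedicated suffixes (or to derivatives) instead of naming them `_dwi`.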

yarikoptic avatar Feb 17 '20 15:02 yarikoptic

I'd like to contribute my use case. I'm converting older datasets to BIDS. Unfortunately, the Prescan Normalize filter on the scanner wasn't used consistently, and it seems that not all data was always exported. This resulted in some subjects having both normalized and non-normalized functional images, and some having only one of them. I want to document that, but I couldn't make up my mind whether to put this information in the file name or the json file, or to keep only the normalized images when they exist and the non-normalized ones otherwise. Fortunately, I found this PR. Could anyone say whether _proc will be added to the specification, and what to use until then? I'd like to be consistent across the datasets I'm working with. Disclaimer: I'm not an MRI person.

mateuszpawlik avatar Sep 15 '21 11:09 mateuszpawlik

I'm also handling a number of images from our Siemens scanner right now, wondering what to do with the NORM image type and how to distinguish them from raw/non-normalized images.

Can anyone share if there are new insights into the issue? Otherwise, I would really like to see this merged :)

octomike avatar Apr 22 '22 11:04 octomike

@mateuszpawlik @octomike for those cases you could use the rec- option.
From the spec: 'the OPTIONAL rec- key/value can be used to distinguish different reconstruction algorithms'

  • the prescan normalize (image type NORM\ND) is a normalization filter applied at the reconstruction stage (but @yarikoptic will argue it is not from k-space, so it's _proc)
  • the gradient distortion correction (image type NORM\DIS2D or NORM\DIS3D) is a correction also done at reconstruction (although it can also be done after)

example: sub-0158_ses-baseline_acq-MPRAGE_rec-normdistcorr_T1w.nii
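A minimal sketch of how such a file name decomposes into entities, so that a tool can read the rec- label back out. It assumes the usual BIDS layout of `key-value` pairs joined by underscores and followed by a suffix; `parse_entities` is a hypothetical helper, not a function of any BIDS library.

```python
def parse_entities(filename: str) -> dict:
    """Split a BIDS-style file name into its key-value entities and suffix."""
    stem = filename.split(".", 1)[0]  # drop .nii, .nii.gz, .json, ...
    *pairs, suffix = stem.split("_")
    entities = dict(pair.split("-", 1) for pair in pairs)
    entities["suffix"] = suffix
    return entities


name = "sub-0158_ses-baseline_acq-MPRAGE_rec-normdistcorr_T1w.nii"
print(parse_entities(name)["rec"])  # normdistcorr
```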

CPernet avatar Apr 22 '22 13:04 CPernet

A more general comment, @yarikoptic: should anything coming out of the scanner be seen as raw (which seems to be the consensus, including those filtered, corrected images, and I store them using _rec), or should this go under derivatives (for which _proc works nicely)?

CPernet avatar Apr 22 '22 13:04 CPernet

> should anything coming out of the scanner be seen as raw

FWIW in one of the latest BIDS workshops I attended, this is the least inappropriate definition we could come up with for what raw means.

In other contexts, whatever the scanner spits out is referred to as unprocessed (despite the fact that lots of processing happens within the scanner).

Regarding how this applies to this case I have no idea, but by reading Cyril's comment, it seems that _rec is the right call.

oesteban avatar Apr 22 '22 13:04 oesteban

least inappropriate oh my :-)

CPernet avatar Apr 22 '22 13:04 CPernet

Please keep in mind that BIDS is meant to meet researchers where they are. As inappropriate as it may be, nearly every MRI researcher refers to the reconstructed images coming off of the scanner as "raw" - using different terminology would just sow confusion.

-- Russell A. Poldrack Albert Ray Lang Professor of Psychology Associate Director, Stanford Data Science Director, SDS Center for Open and Reproducible Science Building 420 Stanford University Stanford, CA 94305

http://www.poldracklab.org/

poldrack avatar Apr 22 '22 13:04 poldrack

Seems like a good time to make a decision, IMO. We've clearly been living without this, and @CPernet offers a solution that does not involve adding more entities to file names.

Is rec-<label> (e.g., rec-Norm for normalized on-scanner, or rec-NormMC for normalized and motion corrected on-scanner) acceptable even if not entirely satisfying? Or is there a compelling use case that makes this use of rec-<label> untenable?

If we do make this change, we could note in the rec- definition that relevant on-scanner processing MAY be indicated in this entity.
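A minimal sketch of how such a label could be assembled, assuming a converter already knows which on-scanner steps were applied. The step names `Norm` and `MC` come from the example above; the helper `rec_label` is hypothetical, and the check reflects the BIDS rule that labels contain only letters and digits.

```python
import re

# BIDS labels are alphanumeric only (no underscores or hyphens)
LABEL_RE = re.compile(r"[0-9a-zA-Z]+")


def rec_label(steps):
    """Concatenate on-scanner processing steps into one rec- label."""
    label = "".join(steps)
    if not LABEL_RE.fullmatch(label):
        raise ValueError(f"not a valid BIDS label: {label!r}")
    return label


print("rec-" + rec_label(["Norm"]))        # rec-Norm
print("rec-" + rec_label(["Norm", "MC"]))  # rec-NormMC
```

Concatenating steps keeps everything in one entity, at the cost of labels that have to be picked apart by convention rather than by the file name grammar.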

effigies avatar Apr 22 '22 14:04 effigies

This is related to dcm2niix issue 597. In particular, see comments from @mharms. If users want to see explicit support for dcm2niix they will need to provide a sample DICOM series where one image has this switched on and the other has it switched off.

neurolabusc avatar Apr 22 '22 14:04 neurolabusc

@neurolabusc I will soon push phantom DICOM files with different image types (it has been Easter break here)

CPernet avatar Apr 22 '22 14:04 CPernet

Another use-case: I am working with an MR physics group that is using different reconstruction algorithms for DWI data. However, it seems like the optional _rec-label is not present for dwi. Could the _rec-label perhaps be added in this pull request to fix that? Note that this is a reconstruction case and is not a processing step that is done on the scanner and I don't think the proc-label is appropriate for this.

dorahermes avatar Apr 22 '22 15:04 dorahermes

> However, it seems like the optional _rec-label is not present for dwi. Could the _rec-label perhaps be added in this pull request to fix that? Note that this is a reconstruction case and is not a processing step that is done on the scanner and I don't think the proc-label is appropriate for this.

I suspect this is better tackled in a separate PR, as it seems an easier fix (applying an existing entity to an existing datatype rather than adding a new entity). It also has better chances of getting solved faster that way, because this PR almost goes back to the prehistory of BIDS.

Remi-Gau avatar Apr 22 '22 16:04 Remi-Gau

Wow, what a popular PR this one became on this Friday! ;) Just to clear up where I stand on raw-vs-derivative, since @CPernet mentioned it -- I can only repeat that AFAIK all data we deal with in MRI is derived. Indeed, "raw BIDS" data are largely obtained from the scanner. "Largely" because we enhance them with extra metadata not coming from the MRI (and possibly recomputed/re-entered, etc. -- derived), we do basic minimal processing (e.g. defacing), we do data format conversion (DICOM -> NIfTI), etc. So yes, we can keep the claim that it is raw, but we all know well that it is not. And all those varieties of preprocessed data coming out of the MRI (normalized, motion corrected, smoothed, could even be contrast maps, etc.) attest to that point. So the question is: should we attribute every mode of operation of the "reconstruction server in the MRI" to _rec, thus diluting its value/meaning IMHO, or accept the fact that the "reconstruction server" can do more than reconstruction (also motion correction, etc.) and allow _proc to disambiguate multiple variants of the data (which could also be produced "manually" in derived data), while leaving _rec for its clear original purpose?
I would vote for the latter. I will now redo this historical PR to the same effect as originally, but will also extend it to DWI for @dorahermes, since I really do not see why we need to discriminate between MRI modalities.

yarikoptic avatar Apr 22 '22 18:04 yarikoptic

Codecov Report

Patch and project coverage have no change.

Comparison is base (03a569d) 88.01% compared to head (467a9b4) 88.01%.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #105   +/-   ##
=======================================
  Coverage   88.01%   88.01%           
=======================================
  Files          14       14           
  Lines        1268     1268           
=======================================
  Hits         1116     1116           
  Misses        152      152           

codecov[bot] avatar Apr 22 '22 19:04 codecov[bot]

Thanks to the glorious schema I could accomplish the world-domination mission of _proc across all modalities which have reconstruction. Moreover, I noticed the following "workaround" implemented for PET here:

If multiple reconstructions of the data are made with the same type of reconstruction,
a number MAY be appended to the label, for example `rec-acdyn1` and `rec-acdyn2`.

which smells like a workaround to me, or is reconstruction not deterministic in PET? (attn @bids-standard/bep009) Would _proc maybe be a useful entity to depict the difference between instances of the same reconstruction?

edit: meg (but not eeg...) already had processing

edit 2: submitted a PR to make use of that latin checker easier

edit 3: I had not lived up to my promise to @dorahermes on adding proc to dwi because it did not even have _rec. IMHO dwi deserves both _rec and _proc since they correspond to different aspects. Pushed now.

yarikoptic avatar Apr 22 '22 19:04 yarikoptic

Humbly checking in again on this one.. :see_no_evil:

Is there a chance we can move this (much awaited) PR forward? Is there any time consuming work that can be outsourced? I'd be happy to help out with anything!

octomike avatar Dec 16 '22 13:12 octomike

@octomike well -- the issue seems to be not technical but social. As a quick workaround, you can just follow the statements above to (ab)use _rec- for anything done in the scanner (similarly to _acq- typically abused to encapsulate any difference in acquisition).

Meanwhile, for some reason **only** MEG data (not even EEG) would be able to carry `_proc-`:
❯ git grep -B8 'processing: '
src/schema/README.md-  datatypes:
src/schema/README.md-    - meg
src/schema/README.md-  entities:
src/schema/README.md-    subject: required
src/schema/README.md-    session: optional
src/schema/README.md-    task: required
src/schema/README.md-    acquisition: optional
src/schema/README.md-    run: optional
src/schema/README.md:    processing: optional
--
src/schema/README.md-  datatypes:
src/schema/README.md-    - meg
src/schema/README.md-  entities:
src/schema/README.md-    subject: required
src/schema/README.md-    session: optional
src/schema/README.md-    task: required
src/schema/README.md-    acquisition: optional
src/schema/README.md-    run: optional
src/schema/README.md:    processing: optional
--
src/schema/rules/files/raw/channels.yaml-
src/schema/rules/files/raw/channels.yaml-# MEG has an additional entity available
src/schema/rules/files/raw/channels.yaml-channels__meg:
src/schema/rules/files/raw/channels.yaml-  $ref: rules.files.raw.channels.channels
src/schema/rules/files/raw/channels.yaml-  datatypes:
src/schema/rules/files/raw/channels.yaml-    - meg
src/schema/rules/files/raw/channels.yaml-  entities:
src/schema/rules/files/raw/channels.yaml-    $ref: rules.files.raw.channels.channels.entities
src/schema/rules/files/raw/channels.yaml:    processing: optional
--
src/schema/rules/files/raw/meg.yaml-  datatypes:
src/schema/rules/files/raw/meg.yaml-    - meg
src/schema/rules/files/raw/meg.yaml-  entities:
src/schema/rules/files/raw/meg.yaml-    subject: required
src/schema/rules/files/raw/meg.yaml-    session: optional
src/schema/rules/files/raw/meg.yaml-    task: required
src/schema/rules/files/raw/meg.yaml-    acquisition: optional
src/schema/rules/files/raw/meg.yaml-    run: optional
src/schema/rules/files/raw/meg.yaml:    processing: optional
--
src/schema/rules/files/raw/task.yaml-    ceagent: optional
src/schema/rules/files/raw/task.yaml-
src/schema/rules/files/raw/task.yaml-timeseries__meg:
src/schema/rules/files/raw/task.yaml-  $ref: rules.files.raw.task.timeseries
src/schema/rules/files/raw/task.yaml-  datatypes:
src/schema/rules/files/raw/task.yaml-    - meg
src/schema/rules/files/raw/task.yaml-  entities:
src/schema/rules/files/raw/task.yaml-    $ref: rules.files.raw.task.timeseries.entities
src/schema/rules/files/raw/task.yaml:    processing: optional

While at it, coming up with potential future complementary cases -- if ultrasound is ever supported (some work seems to be done in the scope of BEP025 - MIDS), there might also be a clear separation between _rec (how the image is reconstructed from whatever domain the data is instrumentally acquired in - frequencies, coils, whatnot) and _proc (how it is processed after being brought into the output data domain - be it an image or a channel). Moreover, for EEG/MEG, some hardware might (if not already, I don't know) be fancy enough to produce a 3D reconstruction via an inverse problem solution (e.g. LORETA etc.), for which there is a good number of approaches (min norm, beamforming, etc.). These are conceptually different from how the reconstructed image is then processed (the point of _proc-) before output.

So, altogether, I think it would be of benefit to have clearly separated _rec and _proc, and neither clump them into a single entity for MRI nor claim that _proc in MEG (only) is like _rec in MRI. So I have updated this PR once again - maybe opinions have changed since then. I also made MEG channels less special ;)
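The grep output above can be read as a table of rules: each rule lists datatypes and the entities allowed for them. A minimal sketch of querying such a table, assuming a hand-written dictionary that mimics that layout. The `meg` entry copies the entities shown above; the `eeg` entry and the function `datatypes_allowing` are made up for illustration and are not the schema's own API.

```python
# Hand-written stand-in for the rule files; NOT loaded from the real schema.
RULES = {
    "meg": {
        "datatypes": ["meg"],
        "entities": {
            "subject": "required",
            "session": "optional",
            "task": "required",
            "acquisition": "optional",
            "run": "optional",
            "processing": "optional",
        },
    },
    "eeg": {  # assumed for illustration: same as meg, minus processing
        "datatypes": ["eeg"],
        "entities": {
            "subject": "required",
            "session": "optional",
            "task": "required",
            "acquisition": "optional",
            "run": "optional",
        },
    },
}


def datatypes_allowing(entity, rules):
    """List the datatypes whose rules mention the given entity."""
    allowed = set()
    for rule in rules.values():
        if entity in rule["entities"]:
            allowed.update(rule["datatypes"])
    return sorted(allowed)


print(datatypes_allowing("processing", RULES))  # ['meg']
```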

yarikoptic avatar Dec 16 '22 22:12 yarikoptic

Agree with @yarikoptic that there is value in distinct _rec and _proc tags and that there shouldn't be different interpretations of each for different modalities. I definitely like how @yarikoptic distinguished them, which I would paraphrase as:

  • _rec: variant of method for transforming raw data into output domain
  • _proc: fundamental processing applied to data in output domain
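A minimal sketch of how the two entities could sit side by side in one file name under this reading. The entity order below is an assumed, illustrative subset (with rec before run and proc last), and the labels `grappa` and `norm` as well as the helper `build_name` are hypothetical.

```python
# Illustrative subset of entities in an assumed order; not the full BIDS table.
ENTITY_ORDER = ["sub", "ses", "task", "acq", "rec", "run", "proc"]


def build_name(entities, suffix, ext=".nii.gz"):
    """Join entities in a fixed order into a BIDS-style file name."""
    unknown = set(entities) - set(ENTITY_ORDER)
    if unknown:
        raise ValueError(f"unsupported entities: {sorted(unknown)}")
    parts = [f"{key}-{entities[key]}" for key in ENTITY_ORDER if key in entities]
    return "_".join(parts + [suffix]) + ext


# rec- names how the image was reconstructed, proc- what was done to it afterwards
print(build_name({"sub": "01", "proc": "norm", "rec": "grappa"}, "T1w"))
# sub-01_rec-grappa_proc-norm_T1w.nii.gz
```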

The only thing that is perhaps missing in the current PR is an elaboration of what is 'fundamental', i.e. how we make it clear that, ideally and in general, post-acquisition manipulations are the domain of BIDS derivatives, but that some manipulations are so common, and data without them regarded as having so little or reduced value, that these "processing" manipulations deserve to be part of a core BIDS instance.

One other idea: Should we add references to _proc in PET to suggest how they could be used? (e.g. different filters applied to a ramp-recon'd image). Maybe even MRI (though I can't think of any use cases myself).

nicholst avatar Dec 17 '22 11:12 nicholst

FWIW, while thinking about the "references" @nicholst mentioned, I decided to look at which _rec labels we already see in openneuro datasets.

I might not have the most recent set of datasets, but here is what I saw:
(git)smaug:/mnt/btrfs/datasets/datalad/crawl/openneuro[master]
$> find -iname *_rec-* | sed -e 's,.*\(_rec-[^_]*\)_.*,\1,g' | sort | uniq -c | sort -n
      2 _rec-t2sweighted
      4 _rec-arithmsum
      6 _rec-ORIG
     24 _rec-deface
     36 _rec-ehalfhalf
     64 _rec-phase
     68 _rec-DynTOF
     72 _rec-1
     72 _rec-2
    101 _rec-acdyn
    152 _rec-dico7Tad2grpbold7Tad
    152 _rec-dico7Tad2grpbold7TadBrainMask
    152 _rec-dico7Tad2grpbold7TadBrainMaskNLBrainMask
    152 _rec-dico7Tad2grpbold7TadNL
    152 _rec-dico7Tad2grpbold7TadNLWarp
    152 _rec-XFMdico7Tad2grpbold7Tad
    230 _rec-swi
    342 _rec-complex
    342 _rec-mag
    514 _rec-NORM
    764 _rec-magnitude
   1215 _rec-SCIC
   1224 _rec-dico

which

  • reminded me that we used to (ab)use _rec- for what we now store in _part- -- phase, magnitude https://github.com/nipy/heudiconv/pull/477
  • made me discover SCIC; a random first google hit, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5471916/ : "surface coil intensity correction (SCIC), which is used to improve the image homogeneity of magnetic resonance imaging when a phased-array surface coil is used for reception." "The SCIC feature is an automatic post-processing technique that reduces noise, and enhances contrast in the image." -- so it really seems to be another _proc
  • pointed to the fact that in PET we already use _rec with some fixed reserved values! https://bids-specification.readthedocs.io/en/stable/04-modality-specific-files/09-positron-emission-tomography.html : acdyn, acstat, nacdyn, nacstat. @bids-standard/bep009 -- I wonder if those are really "reconstruction" or "postprocessing" related?
  • showed that _rec-dico happens in ds001293
(git)smaug:/mnt/btrfs/datasets/datalad/crawl/openneuro/ds001293[master]git
$> ls ./sub-21/ses-r30/func/*run-02*nii.gz
./sub-21/ses-r30/func/sub-21_ses-r30_task-orientation_rec-dico_run-02_bold.nii.gz@
./sub-21/ses-r30/func/sub-21_ses-r30_task-orientation_run-02_bold.nii.gz@

@mih -- do you recall what _rec-dico was used to depict ? (git grep dico is silent)
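The tally of _rec- labels above came from a find/sed/sort/uniq pipeline. A minimal Python sketch of the same count, assuming the file paths are already available as a list of strings; `count_rec_labels` is a hypothetical helper.

```python
import re
from collections import Counter

# Same idea as the sed expression: capture what follows "_rec-" up to the next "_"
REC_RE = re.compile(r"_rec-([^_]+)_")


def count_rec_labels(paths):
    """Count rec- labels over an iterable of file paths."""
    counts = Counter()
    for path in paths:
        name = path.rsplit("/", 1)[-1]  # only look at the file name itself
        match = REC_RE.search(name)
        if match:
            counts[match.group(1)] += 1
    return counts


# Paths copied from listings in this thread
paths = [
    "./sub-21/ses-r30/func/sub-21_ses-r30_task-orientation_rec-dico_run-02_bold.nii.gz",
    "./sub-21/ses-r30/func/sub-21_ses-r30_task-orientation_run-02_bold.nii.gz",
    "ds001705/sub-000101/ses-baseline/pet/sub-000101_ses-baseline_rec-MLEM_pet.nii.gz",
]
for label, n in sorted(count_rec_labels(paths).items()):
    print(n, label)
```

On a real checkout, the list could be produced with `pathlib.Path(root).rglob("*_rec-*")` and each entry converted to a string before counting.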

yarikoptic avatar Dec 19 '22 19:12 yarikoptic

> The only thing that is perhaps missing in the current PR is an elaboration of what is 'fundamental', ...

I would have said: "any postprocessing performed by/on the hardware providing "raw data" for BIDS, plus a selected set of operations done after: at the moment only defacing". But I do not want to open a discussion here about whether "defacing" needs to be explicitly annotated in BIDS file names ;-)

yarikoptic avatar Dec 19 '22 20:12 yarikoptic