ENH: Add MP-PCA/NORDIC denoising through dwidenoise
Closes #3308.
Potential blockers
- dwidenoise does not support NORDIC yet, but there are plans to add it (see https://github.com/MRtrix3/mrtrix3/issues/3031). In the meantime, MP-PCA is the only option.
- dwidenoise doesn't accept a precalculated noise map yet, which limits reproducibility if users want to re-run the denoising step with the output noise map as a precalculated input.
- MRTrix3 doesn't have a dwi2noise command to estimate the noise map from no-RF volumes, so I've disabled no-RF processing for now.
- Tracking lost degrees of freedom is important, and dwidenoise has a `-rank` option to output a DOF map, but that feature hasn't been released yet. See https://github.com/MRtrix3/mrtrix3/issues/2312.
- I'm not sure yet what figures we should generate for the denoising procedure. I think stat-maps of the noise map, DOF map, and possibly variance-explained map (which we'll need to generate) would be good. Maybe a histogram of the noise values?
- The MRTrix3 team recommends concatenating data across echoes before running dwidenoise, and they ultimately plan to support 5D NIfTIs, but my implementation currently just applies the denoising step independently to each echo. See https://github.com/MRtrix3/mrtrix3/issues/3021.
- I need to add an appropriate filename for the noise map to the niworkflows specification (pending https://github.com/nipreps/niworkflows/pull/899).
Changes proposed in this pull request
- Create a workflow to apply MP-PCA or NORDIC denoising to BOLD data. This is done right before slice timing correction in `init_bold_native_wf`.
- Add a new parameter, `--thermal-denoise-method`, to specify which approach to use, if any.
- Add "phase" and "norf" options to the ignore parameter to let users skip those elements of the thermal denoising procedure.
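For context, the core of what dwidenoise does per patch can be sketched as follows. This is a simplified single-patch illustration in plain numpy, not MRtrix3 code: the function name `mppca_patch` and the exact form of the eigenvalue criterion are this sketch's own (following the Veraart et al. 2016 spread criterion), and the released implementation differs in detail.

```python
import numpy as np

def mppca_patch(X):
    """Sketch of MP-PCA denoising for one Casorati matrix.

    X has shape (M, N): M voxels in the patch, N volumes, M >= N assumed.
    Returns (denoised matrix, estimated noise sigma, retained rank).
    """
    M, N = X.shape
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    lam = (s ** 2) / M              # eigenvalues of X.T @ X / M, descending
    rank, sigma2 = N, 0.0
    for p in range(N):
        tail = lam[p:]              # candidate noise-only eigenvalues
        s2 = tail.mean()            # candidate noise variance
        # Pure-noise eigenvalues should span the Marchenko-Pastur support,
        # whose width is approximately 4 * sigma^2 * sqrt((N - p) / M);
        # take the smallest p for which the tail fits inside it.
        if lam[p] - lam[-1] < 4.0 * np.sqrt((N - p) / M) * s2:
            rank, sigma2 = p, s2
            break
    s_cut = s.copy()
    s_cut[rank:] = 0.0              # hard-threshold the noise components
    return U @ (s_cut[:, None] * Vt), float(np.sqrt(sigma2)), rank
```

Sliding this over patches, with each patch's central voxel taking the denoised value, gives the basic pipeline; the DOF map mentioned above is simply `rank` per patch, and the noise map is `sigma` per patch.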
Codecov Report
Attention: Patch coverage is 53.81356% with 109 lines in your changes missing coverage. Please review.
Project coverage is 71.35%. Comparing base (ed8804e) to head (fa3c305). Report is 110 commits behind head on master.
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3395      +/-   ##
==========================================
- Coverage   72.33%   71.35%    -0.99%
==========================================
  Files          57       59        +2
  Lines        4269     4503      +234
  Branches      545      576       +31
==========================================
+ Hits         3088     3213      +125
- Misses       1065     1154       +89
- Partials      116      136       +20
> dwidenoise doesn't accept a precalculated noise map yet, which limits reproducibility if users want to re-run the denoising step with the output noise map as a precalculated input.
It is intended to address this at some stage; for tracking see https://github.com/MRtrix3/mrtrix3/issues/2274 and https://github.com/MRtrix3/mrtrix3/issues/3035.
> MRTrix3 doesn't have a dwi2noise command to estimate the noise map from no-RF volumes, so I've disabled no-RF processing for now.
I've added a note to https://github.com/MRtrix3/mrtrix3/issues/3035.
> Tracking lost degrees of freedom is important ...
Agree 100%; we're invested locally in being able to modulate downstream applications appropriately based on this. It would be good to:
- Export noise and rank images to the HTML for visual assessment.
- Apply estimated transforms to the rank image and aggregate, to produce an approximate signal rank per output element (i.e. it needs to be propagated for each output space, including surfaces).
> `dwidenoise` has a `-rank` option to output a DOF map, but that feature hasn't been released yet.
There are a few parallel branches with proposed changes to dwidenoise; some may be on the `dev` branch, and others are not even there yet, let alone on `master`. If need be I can aggregate multiple such changes onto a dedicated branch that fMRIPrep can then clone.
> ... possibly variance-explained map (which we'll need to generate) ...
That could be done. It would be reasonably easy to add an export option to dwidenoise rather than trying to compute it after the fact. Do you want to make a feature request?
> The MRtrix3 team recommends concatenating data across echoes before running dwidenoise, and they ultimately plan to support 5D NIfTIs, but my implementation currently just applies the denoising step independently to each echo.
When I've done this myself, I've concatenated data across echoes into the fourth dimension, denoised, and then separated the volumes back out into echoes. That is I think more principled given that the noise level should be identical. It's not super difficult to do. You may however want to clamp the maximal patch size: once (# echoes x # volumes) exceeds 343 the default changes to a 9x9x9 patch and that runs very slowly. This behaviour should be better with https://github.com/MRtrix3/mrtrix3/issues/2742 / https://github.com/MRtrix3/mrtrix3/pull/3029, which won't have such step changes in patch size.
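The concatenate / denoise / split bookkeeping described here is small. A sketch in plain numpy, with an identity function standing in for the actual dwidenoise call, so only the reshaping logic is real:

```python
import numpy as np

def denoise_4d(data):
    """Stand-in for the real denoiser (e.g. a dwidenoise invocation);
    identity here so the reshaping logic can be demonstrated."""
    return data

def denoise_multiecho(echoes):
    """Concatenate echoes along the volume axis, denoise once, split back.

    `echoes` is a list of 4D arrays (x, y, z, t), one per echo, all with
    the same number of volumes. Because the thermal noise level is shared
    across echoes, a single joint PCA is more principled than per-echo runs.
    """
    n_vols = echoes[0].shape[-1]
    stacked = np.concatenate(echoes, axis=-1)   # (x, y, z, t * n_echoes)
    denoised = denoise_4d(stacked)
    # Split the volumes back out into per-echo series
    return [denoised[..., i * n_vols:(i + 1) * n_vols]
            for i in range(len(echoes))]
```

Note the caveat above: once `n_echoes * n_vols` exceeds the released default threshold, the automatically chosen patch grows and runtime suffers, so clamping the patch extent when invoking the real denoiser is worth considering.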
Some other points to consider:
- I found the denoising under default usage to be imperfect for fMRI data due to the large offset from zero. This seems to be possible to mitigate by either:
  - Demeaning the data prior to PCA: https://github.com/MRtrix3/mrtrix3/pull/2363. This however would benefit from confirmation that it does not introduce any kind of bias into the noise estimator / reported rank.
  - Utilising double-precision floating-point. This can already be chosen at the command line, but obviously comes with a performance penalty.
  - Using a more stable PCA method: https://github.com/MRtrix3/mrtrix3/pull/2906. This also comes with a performance penalty.
- The `dwidenoise` command has a different copyright to the default MRtrix3 one. There is also a patent owned by NYU regarding the denoising of a 4D series based on the MP distribution. There's been internal MRtrix3 discussion on whether the scope of that patent is restricted to the specific rank estimator presented in the original article, and therefore does not cover the alternative estimator in use by default as published in this work, but I don't think we reached a definitive conclusion. Pinging @jdtournier, @dchristiaens, @jelleveraart for full transparency / invitation of opinions.
Thanks @Lestropie for looping me into this conversation.
- I agree that accepting precomputed noise maps is an impactful feature, not only because it might accelerate the algorithm but also because it would improve its robustness in the case of correlated noise (https://doi.org/10.1162/imag_a_00049). It would further allow denoising of individual series without the need to concatenate them.
- Similarly, I wonder whether accepting the "rank" as an input might be of interest for certain applications in which preserving homoscedasticity is more critical than maximizing precision.
- Concatenating imaging series with varying echoes in the fourth dimension is conceptually a proper strategy under the assumption of identical noise levels. However, as suggested by @Lestropie, it is recommended to keep the patch size relatively small if the noise level is spatially varying; a 5x5x5 patch is commonly used. If the individual echo series have > 125 images, then denoising the individual series might actually be preferred, as it does not impose any risk of masking subtle signal features across the series. An approach could be to (a) concatenate all series to estimate the noise level, and (b) use the resulting noise map as an input to denoise the individual series.
- We have previously used the demeaning step, but I will confirm that there is no risk of introducing biases or lowering the robustness of the current algorithm.
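The two-step (a)/(b) approach above could look like the following sketch. Everything here is hypothetical: `estimate_noise` is a crude stand-in (MAD of temporal differences) for the noise map a joint MP-PCA run would produce, and a denoiser accepting a precomputed noise map is a proposed, not released, dwidenoise feature.

```python
import numpy as np

def estimate_noise(stacked):
    """Crude per-voxel noise estimate from temporal differences (MAD-based).
    A real pipeline would use the MP-PCA noise map from the joint run."""
    diffs = np.diff(stacked, axis=-1)
    med = np.median(diffs, axis=-1, keepdims=True)
    mad = np.median(np.abs(diffs - med), axis=-1)
    # 1.4826 converts MAD to sigma; sqrt(2) accounts for differencing
    return 1.4826 * mad / np.sqrt(2.0)

def denoise_with_noise_map(echo, noise_map):
    """Stand-in for a denoiser accepting a precomputed noise map
    (a dwidenoise feature that is only proposed, not released)."""
    return echo  # identity placeholder

def two_step_denoise(echoes):
    # (a) concatenate all series only to estimate the noise level ...
    stacked = np.concatenate(echoes, axis=-1)
    noise_map = estimate_noise(stacked)
    # (b) ... then denoise each series independently with that fixed map,
    # avoiding any risk of masking subtle features across the series.
    return [denoise_with_noise_map(e, noise_map) for e in echoes], noise_map
```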
> I wonder whether accepting the "rank" as an input might be of interest for certain applications in which preserving homoscedasticity is more critical than maximizing precision.
Posted as https://github.com/MRtrix3/mrtrix3/issues/3046. Would need implementation within the underlying denoising command; whether fMRIPrep would want to utilise that is beyond my control.
> it is recommended to keep the patch size relatively small if the noise level is spatially varying
It might be possible to mitigate this if the full noise map profile is known a priori, and voxels can be normalised to unit variance upon insertion into the Casorati matrix? https://github.com/MRtrix3/mrtrix3/issues/3041
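That normalisation could look like this sketch: given an a-priori per-voxel noise map, scale each row of the Casorati matrix to unit noise variance before denoising and undo the scaling afterwards. `denoise_matrix` is a placeholder for any MP-PCA routine that assumes homoscedastic noise; the names are this sketch's own, not from MRtrix3.

```python
import numpy as np

def denoise_with_known_sigma(patch, sigma, denoise_matrix):
    """Normalise each voxel's row of the Casorati matrix to unit noise
    variance using a known noise map, denoise, then rescale.

    patch: 4D array (x, y, z, t); sigma: 3D array (x, y, z) of noise levels.
    """
    X = patch.reshape(-1, patch.shape[-1])   # (M voxels, N volumes)
    w = sigma.reshape(-1, 1)                 # per-row noise level
    Xd = denoise_matrix(X / w)               # homoscedastic input for the PCA
    return (Xd * w).reshape(patch.shape)     # restore original scaling
```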
> A 5x5x5 patch is commonly used.
This proposed implementation with spherical kernels would allow more flexible patch sizes: https://github.com/MRtrix3/mrtrix3/issues/2742
> We have previously used the demeaning step, but I will confirm that there is no risk of introducing biases or lowering the robustness of the current algorithm.
Cool; it would be useful for me to know from someone more familiar with the mathematics / simulations that it's sensible.
Thanks @Lestropie and @jelleveraart!
> Do you want to make a feature request?
Sure! I'll open an issue requesting a variance explained output map.
> I found the denoising under default usage to be imperfect for fMRI data due to the large offset from zero.
In NORDIC they divide by the min value in the first volume. Maybe that's why they do it?
I now have a line-by-line Python translation of the MATLAB code that seems to work pretty well, so that might make it easier to translate to C++. There are a bunch of filtering steps that might not be in dwidenoise already.
> The `dwidenoise` command has a different copyright to the default MRtrix3 one.
I didn't realize that! That could be an issue for fMRIPrep. I think the devs avoid anything that doesn't allow for commercial or clinical uses, but I could be misremembering. Do you know if the optimal shrinkage approaches it looks like you're implementing in dwidenoise would fall under the same copyright?
> > I found the denoising under default usage to be imperfect for fMRI data due to the large offset from zero.
>
> In NORDIC they divide by the min value in the first volume. Maybe that's why they do it?
Even if that is indeed why they do it, it nevertheless seems to me an unusual choice of how to do it. You could get data from the scanner where the raw intensities are enormous, but you would only need one isolated voxel in one volume with a value that is close-to-but-not-quite-zero and that rescaling step will not serve the purported purpose.
I had originally proposed an explicit demeaning step in https://github.com/MRtrix3/mrtrix3/pull/2363. That is being superseded by https://github.com/MRtrix3/mrtrix3/pull/3029, which for DWI does a demeaning per shell; for fMRI it should just regress out, for each voxel, the mean value across all volumes. There's a potential improvement to be had here for demeaning of multi-echo fMRI data using the same approach.
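The per-voxel demeaning described here (regress out each voxel's mean across volumes, denoise, restore) amounts to a thin wrapper around any 4D denoiser; a sketch with the identity as placeholder denoiser:

```python
import numpy as np

def demean_then_denoise(data, denoise):
    """Remove each voxel's temporal mean before denoising, restore it after.

    data: 4D array (x, y, z, t); denoise: any callable on 4D arrays.
    This sidesteps the large offset from zero in fMRI magnitude data
    without relying on an ad-hoc global rescaling.
    """
    mean = data.mean(axis=-1, keepdims=True)   # per-voxel mean across volumes
    return denoise(data - mean) + mean
```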
> Do you know if the optimal shrinkage approaches it looks like you're implementing in `dwidenoise` would fall under the same copyright?
I believe no. It wasn't a part of the NYU work, and I think it's not included in their patent. However I'm not fully across these bureaucratic complications. There have previously been discussions / questions around the scope of what is protected. I've started a separate discussion thread for that: https://github.com/MRtrix3/mrtrix3/issues/3059