tedana Align with old meica

Closes #929.

While working on the decision tree modularization, we noticed that the tedana decision tree had a difference from the older MEICA decision tree. Components that used to be classified as midK could later be ignored rather than rejected, but in tedana, once they were rejected, they couldn't later be ignored. More details are in #929. This PR should address this issue.

Changes proposed in this pull request:

kundu.json was copied to tedana_orig.json
meica.json is similar to kundu.json, but was edited to include a provisionalreject intermediate classification. This is what was originally called mid kappa or midk That is used to allow the same components to be evaluated across multiple nodes and eventually rejected. This should match the decision tree steps in MEICA v2.5.
Several people compared the results of tedana_orig and meica. While tedana_orig rejects more, everything it rejected seemed reasonable to reject. For now tedana_orig (formerly kundu) will remain the default and this with NOT be labeled a breaking change
The [tedana_only/meica].[tex/png] flow charts were updated to reflect the above changes
- [ ] references.bib now points to doi = {10.6084/m9.figshare.25251433.v2} but version 2 does not yet exist. Once others are happy with these new tex/png files I'll upload them to figshare to make version 2.
Documentation explaining this change and the relationship between the three decision trees was updated in multiple .rst files and in function doc strings.
If someone tries to run tedana using kundu as the three it will log a warning and then run using tedana_only
All 3 json files for the decision trees were cleaned up to remove empty log_extra_info fields. When Black was run on these json files, it make additional style changes.
The _comment text in tedana_orig.json was also edited to highlight the points where it differs from meica.json

Jun 08 '23 21:06 handwerkerd

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 89.68%. Comparing base (9cbd484) to head (6270013).

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #952      +/-   ##
==========================================
+ Coverage   89.59%   89.68%   +0.09%     
==========================================
  Files          26       26              
  Lines        3478     3482       +4     
  Branches      614      615       +1     
==========================================
+ Hits         3116     3123       +7     
+ Misses        213      211       -2     
+ Partials      149      148       -1

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

Jun 08 '23 22:06 codecov[bot]

@n-reddy @dowdlelt Do either of you have a chance to run this on a bunch of data during the next few weeks? There will be differences, but I want to make sure the differences are relatively minor.

Aug 15 '23 18:08 handwerkerd

Yes, I can test this. To confirm, you want to compare the kundu tree modified here to the kundu tree that is used in the current version of tedana?

Aug 15 '23 19:08 n-reddy

Yes, I can test this. To confirm, you want to compare the kundu tree modified here to the kundu tree that is used in the current version of tedana?

Correct. Compare current kundu tree to the kundu tree used in this PR. Thank you.

Aug 15 '23 21:08 handwerkerd

I'll try and jump on this at some point, but might be just a bit... Sounds straightforward enough

Aug 17 '23 16:08 dowdlelt

@n-reddy and I ran this on distinct datasets. We're both seeing the same thing. As expected, this PR is either identical to main or is accepting some components that are currently being rejected by main. I was assuming the total changed variance would be fairly small, but that's not always true.

Using my dataset as an example, I have 170 runs from 25 subjects. 112 runs are identical. 58 are accepting additional components with variance explained ranging from 0.15% to 41% (freq data below). In a spot check of my data, most of the large variance components that are now being accepted look like they should have been rejected

That means this is NOT a trivial change to the output. I propose we do one of the following options:

The kundu tree should be what Kundu intended, we should merge this PR.
In addition to merging this into the kundu tree, we should create a third default tree that's what tedana has been using for years so that people can continue to replicate past results (and they might like the slightly more aggressive removal)
Keep kundu as is so that people don't suddenly see different results and create a third default tree (meica?) so that people can replicate the meica method.

For any of those options, we should also add explanations to the documentation.

Thoughts? If we don't have clear agreement by our 9/21 dev call, this will be on the agenda.

Var Explained  # runs
0	           3
0.5	           6
1	           11
1.5	           6
2	           4
5.5	           3
3	           1
6.5	           3
4	           1
7.5	           2
5	           1
8.5	           0
6	           0
9.5	           1
7	           2
10.5	           1
8	           1
11.5	           0
9	           1
12.5	           0
10	           11

Sep 06 '23 17:09 handwerkerd

I updated the code and documentation to include both our current and the original meica trees. I'm calling them meica and tedana_v0.013 That name might not stick, but I wanted to have a placeholder label that people can run. Better naming ideas are welcome.

Anyone should be able to compare branches by using this branch and then the --tree flag to select one of the two trees. I'd still recommend running one first and then using --mix apply the identical ICA mixing matrix to the other decision tree. https://github.com/jbteves/dtm_tools can be used to compare results. The only change should be that some components rejected by tedana_v0.013 should be accepted by meica. I expect to see non-trivial variance explained by those components. I'd love people to look at the components that change and, particularly for higher variance components, decide if they were better to accept/reject. The default tree is currently tedana_v0.013 but if it turns out that tree is rejecting components that were plausible to accept, we might want to set the default to meica. I can explain more, if useful.

@marco7877 @BahmanTahayori and @goodalse2019 said they'd try to find time to do this. Others are very welcome too.

Sep 21 '23 20:09 handwerkerd

Checking if @marco7877 @BahmanTahayori and @goodalse2019 or others have had a chance to run and compare the three branches before our scheduled Oct 17 dev call. Besides coming up with a better tree name than tedana_v0.013 once enough people kick this around and make sure it's ok, it should be ready to merge.

Oct 12 '23 15:10 handwerkerd

I made all the changes we discussed a few months ago and this should now be ready to review & hopefully merge. I updated the opening comment to detail everything that is or isn't changing there. Of note, I originally called this a breaking change, but, since we decided to keep the same tree as our default, this is no longer a breaking change.

Since this is both some code changes and explanatory documentation, I wouldn't mind a few extra eyes to make sure the documentation is clear. Having a few people run this to make sure nothing surprising happens on real data would also be useful. (@tsalo @eurunuela @dowdlelt @goodalse2019 @n-reddy )

Feb 23 '24 20:02 handwerkerd

@n-reddy and @dowdlelt Could you look this over and run it with the tedana and meica trees to make sure everything looks good? If there are any issues, I'd like to be able to discuss at the dev call next week.

Mar 12 '24 18:03 handwerkerd

@n-reddy Can you re-approve after the last few updates?

Mar 21 '24 18:03 handwerkerd

tedana tedana copied to clipboard

Align with old meica

Codecov Report

tedana
tedana copied to clipboard