Add coordinate-based coactivation-based parcellation class
Closes #260. Tagging @DiveicaV in case she wants to look at this.
We are using Chase et al. (2020) as the basis for our general approach, especially the metrics we're using for kernel and order selection.
EDIT: A recommendation from @SBEickhoff is to look at Liu et al. (2020) and Plachti et al. (2019) as well.
To do:
- [x] Support lists of values for `r` and `n` parameters. These correspond to the "filter sizes" in Chase et al. (2020).
- [ ] Determine clustering options
- [ ] Filter size selection step (a sketch of two of these metrics follows after this list)
  - [ ] Metric: misclassified voxels
  - [ ] Metric: variation of information
  - [ ] Metric: silhouette value
  - [ ] Metric: percentage of voxels not related to the dominant parent cluster
  - [ ] Metric: change in inter- versus intra-cluster distance ratio
- [ ] Refactor to easily support ImageCBP and MAMP with limited code duplication
- [ ] Tests
- [ ] Documentation
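As a concrete starting point for the filter-selection metrics above, here's a minimal sketch of two of them (silhouette value and variation of information) on random stand-in data. The `variation_of_information` helper is my own illustration of the standard identity VI(A, B) = H(A) + H(B) - 2I(A, B), not an existing NiMARE or scikit-learn function:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import mutual_info_score, silhouette_score

rng = np.random.default_rng(0)
data = rng.normal(size=(500, 50))  # stand-in (n_voxels, n_features) matrix

# Two clustering solutions to compare, e.g., adjacent cluster orders.
labels_k3 = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(data)
labels_k4 = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(data)

def variation_of_information(labels_a, labels_b):
    """VI(A, B) = H(A) + H(B) - 2 * I(A, B), in nats."""
    def entropy(labels):
        _, counts = np.unique(labels, return_counts=True)
        probs = counts / counts.sum()
        return -np.sum(probs * np.log(probs))

    mi = mutual_info_score(labels_a, labels_b)
    return entropy(labels_a) + entropy(labels_b) - 2 * mi

print("silhouette:", silhouette_score(data, labels_k3, metric="euclidean"))
print("VI:", variation_of_information(labels_k3, labels_k4))
```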
Changes proposed in this pull request:
- Add `n` option to `Dataset.get_studies_by_coordinate()` (usage sketch below).
- Draft new `parcellate` module with `CoordCBP` class.
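To illustrate the new `n` option, here's a hedged usage sketch; the dataset path and seed coordinate are placeholders:

```python
from nimare.dataset import Dataset

dset = Dataset("dataset.json")  # placeholder path
xyz = [[0, -52, 26]]  # one seed coordinate (MNI mm); placeholder value

# Existing behavior: studies reporting a focus within r mm of the seed.
ids_within_6mm = dset.get_studies_by_coordinate(xyz, r=6)

# New in this PR: the n studies with foci nearest to the seed.
ids_nearest_50 = dset.get_studies_by_coordinate(xyz, n=50)
```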
@mriedel56 @62442katieb if possible, I'd love it if you could check out the new class (especially the `_fit` method, which does the actual CBP) and give your thoughts; a simplified sketch of that loop follows below. So far, I just have the most basic elements of the algorithm implemented, so I still need input on (1) the clustering algorithm options, (2) the metrics to use, and (3) the outputs to save.
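For reviewers who want the gist without reading the diff, here is roughly the shape of the per-voxel loop. This is a simplified sketch, not the actual `_fit` body; in particular, the binary study-membership profile is a crude stand-in for a proper MACM coactivation profile:

```python
import numpy as np
from sklearn.cluster import KMeans

def cbp_sketch(dset, target_xyz, filter_sizes, cluster_range):
    """Simplified CBP loop: one profile per target voxel, one k-means
    solution per (filter size, cluster order) pair."""
    solutions = {}
    for n in filter_sizes:
        # Build a (n_voxels, n_studies) matrix of study-membership profiles.
        rows = []
        for xyz in target_xyz:
            ids = dset.get_studies_by_coordinate([xyz], n=n)
            rows.append(np.isin(dset.ids, ids).astype(float))
        data = np.vstack(rows)

        for k in cluster_range:
            km = KMeans(n_clusters=k, n_init=10, random_state=0)
            solutions[(n, k)] = km.fit_predict(data)
    return solutions
```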
Ultimately, I want this class to be fairly basic, meaning it won't include too many tunable parameters, with some documentation pointing toward `cbptools` for users who require more control.
Additional questions:
- Should we run PCA before clustering? (See the sketch after this list.) From the sklearn clustering user guide:
  > in very high-dimensional spaces, Euclidean distances tend to become inflated (this is an instance of the so-called “curse of dimensionality”). Running a dimensionality reduction algorithm such as Principal component analysis (PCA) prior to k-means clustering can alleviate this problem and speed up the computations.
- Do we want to leverage sample weights at all, e.g., by weighting studies by their sample sizes? (One option is shown in the sketch after this list.)
- How do we want to structure our outputs? The label maps can go in a standard `MetaResult`, but we have additional information, like filter selection ranges and metrics, that we probably want to output as well.
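On the PCA and sample-weight questions, here's a combined, speculative sketch of what both could look like. Note that k-means `sample_weight` weights rows (voxels here), so weighting *studies* would mean scaling feature columns; the column-scaling shown is just one option, not a recommendation:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)
data = rng.normal(size=(500, 300))        # stand-in (n_voxels, n_studies) matrix
study_n = rng.integers(10, 80, size=300)  # stand-in per-study sample sizes

# PCA before k-means, per the sklearn guidance quoted above.
pipe = Pipeline([
    ("pca", PCA(n_components=50)),
    ("kmeans", KMeans(n_clusters=4, n_init=10, random_state=0)),
])
labels = pipe.fit_predict(data)

# Speculative study weighting: scale each study's column by sqrt(weight), so
# larger studies contribute proportionally more to squared Euclidean distances.
weights = study_n / study_n.mean()
labels_weighted = pipe.fit_predict(data * np.sqrt(weights))
```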
Codecov Report
Base: 88.55% // Head: 84.29% // Decreases project coverage by 4.26% :warning:

Coverage data is based on head (`0c60dd5`) compared to base (`e269941`). Patch coverage: 7.89% of modified lines in this pull request are covered.
Additional details and impacted files
```diff
@@            Coverage Diff             @@
##             main     #533      +/-   ##
==========================================
- Coverage   88.55%   84.29%   -4.27%
==========================================
  Files          38       36       -2
  Lines        4370     4069     -301
==========================================
- Hits         3870     3430     -440
- Misses        500      639     +139
```
| Impacted Files | Coverage Δ | |
|---|---|---|
| `nimare/parcellate.py` | 0.00% <0.00%> (ø) | |
| `nimare/dataset.py` | 90.33% <100.00%> (+0.37%) | :arrow_up: |
| `nimare/utils.py` | | |
| `nimare/base.py` | | |
| `nimare/__init__.py` | | |
For reference, Neurosynth's existing clustering module: https://github.com/neurosynth/neurosynth/blob/master/neurosynth/analysis/cluster.py
@62442katieb has some code from her naturalistic meta-analysis that may implement some of these metrics: https://github.com/62442katieb/meta-analytic-kmeans/blob/daf3904caad990aeadc89bc98769aaed32857e09/evaluating_clustering_solutions.ipynb