Add coordinate-based coactivation-based parcellation class

Open tsalo opened this issue 3 years ago • 4 comments

Closes #260. Tagging @DiveicaV in case she wants to look at this.

We are using Chase et al. (2020) as the basis for our general approach, especially for the metrics used in kernel and order selection.

EDIT: @SBEickhoff recommends looking at Liu et al. (2020) and Plachti et al. (2019) as well.

To do:

  • [x] Support lists of values for r and n parameters. These correspond to the "filter sizes" in Chase et al. (2020).
  • [ ] Determine clustering options
  • [ ] Filter size selection step
  • [ ] Metric: misclassified voxels
  • [ ] Metric: variation of information
  • [ ] Metric: silhouette value
  • [ ] Metric: percentage of voxels not related to the dominant parent cluster
  • [ ] Metric: change in inter- versus intra-cluster distance ratio
  • [ ] Refactor to easily support ImageCBP and MAMP with limited code duplication
  • [ ] Tests
  • [ ] Documentation
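
For reference while the metric items above are open, here is a minimal sketch of two of them (silhouette value and variation of information) using scikit-learn and NumPy. It is not the implementation in this PR: the toy array stands in for a seed-voxel by feature matrix, `variation_of_information` is a hypothetical helper, and comparing solutions at adjacent cluster counts is an assumption about how the metric would be applied.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import mutual_info_score, silhouette_score


def variation_of_information(labels_a, labels_b):
    """Hypothetical helper: VI(A, B) = H(A) + H(B) - 2 * I(A; B), in nats."""

    def entropy(labels):
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log(p))

    # mutual_info_score also uses the natural logarithm
    mi = mutual_info_score(labels_a, labels_b)
    return entropy(labels_a) + entropy(labels_b) - 2.0 * mi


rng = np.random.default_rng(0)
# Toy stand-in for real data: 60 "seed voxels" x 10 features, three separated groups
centers = np.array([[0.0] * 10, [5.0] * 10, [10.0] * 10])
data = np.vstack([c + rng.normal(scale=0.5, size=(20, 10)) for c in centers])

labels = {}
scores = {}
for k in (2, 3, 4):
    labels[k] = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(data)
    scores[k] = silhouette_score(data, labels[k])

# Compare the solutions at adjacent cluster counts
vi_2_3 = variation_of_information(labels[2], labels[3])
best_k = max(scores, key=scores.get)
```

A higher silhouette value indicates tighter, better-separated clusters, while a lower variation of information between two solutions indicates that they partition the voxels more similarly.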

Changes proposed in this pull request:

  • Add n option to Dataset.get_studies_by_coordinate().
  • Draft new parcellate module with CoordCBP class.
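
To illustrate the difference between the r and n filters, here is a minimal NumPy sketch. It does not use NiMARE's API: `select_studies` and its arguments are hypothetical, and it assumes that r means "studies with a peak within r mm of the seed" and n means "the n studies whose closest peak is nearest to the seed".

```python
import numpy as np


def select_studies(seed_xyz, coords, study_ids, r=None, n=None):
    """Hypothetical helper (not NiMARE's API): pick studies by radius r or count n."""
    if (r is None) == (n is None):
        raise ValueError("Specify exactly one of r or n.")

    coords = np.asarray(coords, dtype=float)
    study_ids = np.asarray(study_ids)
    dists = np.linalg.norm(coords - np.asarray(seed_xyz, dtype=float), axis=1)

    # Distance from the seed to each study's closest peak
    unique_ids, inverse = np.unique(study_ids, return_inverse=True)
    min_dists = np.full(unique_ids.shape, np.inf)
    np.minimum.at(min_dists, inverse, dists)

    if r is not None:
        # Radius filter: every study with at least one peak within r
        return unique_ids[min_dists <= r].tolist()

    # Count filter: the n studies closest to the seed
    order = np.argsort(min_dists, kind="stable")
    return unique_ids[order[:n]].tolist()


# Toy peaks: study "a" has two peaks, "b" and "c" have one each
coords = [[0, 0, 0], [10, 0, 0], [3, 0, 0], [30, 0, 0]]
study_ids = ["a", "a", "b", "c"]

within_5mm = select_studies((0, 0, 0), coords, study_ids, r=5)
closest_2 = select_studies((0, 0, 0), coords, study_ids, n=2)
```

The radius filter returns a variable number of studies per seed voxel, whereas the count filter always returns n studies but over a variable distance.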

tsalo avatar Jun 30 '21 16:06 tsalo

@mriedel56 @62442katieb if possible, I'd love it if you could check out the new class (especially the _fit method, which does the actual CBP) and give your thoughts. So far, I just have the most basic elements of the algorithm implemented, so I still need input on (1) the clustering algorithm options, (2) the metrics to use, and (3) the outputs to save.

Ultimately, I want this class to stay fairly basic, without too many tunable parameters, and to have the documentation point users who need more control toward cbptools.

Additional questions:

  • Should we run PCA before clustering? From the sklearn clustering user guide:

    in very high-dimensional spaces, Euclidean distances tend to become inflated (this is an instance of the so-called “curse of dimensionality”). Running a dimensionality reduction algorithm such as Principal component analysis (PCA) prior to k-means clustering can alleviate this problem and speed up the computations.

  • Do we want to leverage sample weights at all? E.g., by weighting by studies' sample sizes?
  • How do we want to structure our outputs? The label maps can go in a standard MetaResult, but we have additional information, like filter selection ranges and metrics, that we probably want to output as well.
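
On the PCA question, one possible arrangement is to chain the two steps in a scikit-learn pipeline. This is only a sketch under assumptions: the random array stands in for a seed-voxel by feature matrix, and the component and cluster counts are arbitrary placeholders rather than recommendations.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
# Toy stand-in: 200 "seed voxels" x 2000 features
n_seed_voxels, n_features = 200, 2000
data = rng.normal(size=(n_seed_voxels, n_features))
data[:100, :200] += 3.0  # give the first half of the voxels a shared pattern

# Reduce dimensionality first, then cluster in the reduced space
model = make_pipeline(
    PCA(n_components=20, random_state=0),
    KMeans(n_clusters=2, n_init=10, random_state=0),
)
labels = model.fit_predict(data)

# Fraction of voxels assigned consistently with the planted grouping
truth = np.repeat([0, 1], 100)
agreement = max(np.mean(labels == truth), np.mean(labels != truth))
```

Keeping PCA inside the pipeline means that the number of components becomes one more parameter to expose or fix, which is relevant to the goal of keeping the class simple.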

tsalo avatar Jun 30 '21 16:06 tsalo

Codecov Report

Base: 88.55% // Head: 84.29% // Decreases project coverage by -4.26% :warning:

Coverage data is based on head (0c60dd5) compared to base (e269941). Patch coverage: 7.89% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #533      +/-   ##
==========================================
- Coverage   88.55%   84.29%   -4.27%     
==========================================
  Files          38       36       -2     
  Lines        4370     4069     -301     
==========================================
- Hits         3870     3430     -440     
- Misses        500      639     +139     

Impacted Files          Coverage Δ
nimare/parcellate.py    0.00% <0.00%> (ø)
nimare/dataset.py       90.33% <100.00%> (+0.37%) :arrow_up:
nimare/utils.py
nimare/base.py
nimare/__init__.py

codecov[bot] avatar Jul 12 '21 22:07 codecov[bot]

https://github.com/neurosynth/neurosynth/blob/master/neurosynth/analysis/cluster.py

adelavega avatar Apr 20 '22 19:04 adelavega

@62442katieb has some code from her naturalistic meta-analysis that may implement some of these metrics: https://github.com/62442katieb/meta-analytic-kmeans/blob/daf3904caad990aeadc89bc98769aaed32857e09/evaluating_clustering_solutions.ipynb

tsalo avatar Apr 20 '22 19:04 tsalo