[feature] Transition the phenotype associations to non-parametric tests

Open cdiener opened this issue 10 months ago • 0 comments

Purpose

The former strategy of identifying metabolite:phenotype associations from the coefficients of a LASSO model has proven quite unstable due to (a) the non-uniqueness of the coefficients and (b) instability across scikit-learn versions and initializations. This PR switches to a more stringent approach that uses Mann-Whitney U or Spearman rho tests to assess metabolite-level associations. A LASSO model is still fit to assess the overall/global association with the phenotype.
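As a rough illustration of the metabolite-level tests described above, the sketch below runs a Mann-Whitney U test when the phenotype has two groups and a Spearman rho test otherwise, using SciPy. The helper `metabolite_associations`, its arguments, and the pairing of test with phenotype type are assumptions made for illustration; this is not MICOM's actual implementation.

```python
import numpy as np
from scipy.stats import mannwhitneyu, spearmanr


def metabolite_associations(fluxes, phenotype, names, binary):
    """Test each metabolite column against the phenotype.

    Hypothetical helper (not part of MICOM).
    fluxes    : array of shape (n_samples, n_metabolites)
    phenotype : array of shape (n_samples,)
    names     : metabolite names, one per column of `fluxes`
    binary    : True for a two-group phenotype, False for a continuous one
    Returns a dict mapping metabolite name -> (statistic, p-value).
    """
    fluxes = np.asarray(fluxes, dtype=float)
    phenotype = np.asarray(phenotype)
    results = {}
    for j, name in enumerate(names):
        x = fluxes[:, j]
        if binary:
            groups = np.unique(phenotype)
            if len(groups) != 2:
                raise ValueError("a binary phenotype needs exactly two groups")
            first = x[phenotype == groups[0]]
            second = x[phenotype == groups[1]]
            # U statistic is reported for the first (sorted) group.
            stat, p = mannwhitneyu(first, second, alternative="two-sided")
        else:
            # Rank correlation between the metabolite and the phenotype.
            stat, p = spearmanr(x, phenotype)
        results[name] = (float(stat), float(p))
    return results


if __name__ == "__main__":
    # Toy data only: 20 samples, 2 made-up metabolites.
    rng = np.random.default_rng(0)
    toy_fluxes = rng.normal(size=(20, 2))
    toy_groups = np.array(["control"] * 10 + ["case"] * 10)
    print(metabolite_associations(toy_fluxes, toy_groups, ["m1", "m2"], binary=True))
```

Because each metabolite is tested on its own, the result does not depend on solver initialization or on which of several equivalent coefficient vectors a penalized regression happens to return.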

Visualization

The visualization is similar but now shows a confusion matrix for binary outcomes. A quantitative effect measure will be used instead of coefficients.

[Image: example visualization]
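For readers unfamiliar with the confusion matrix mentioned above, here is a minimal sketch of how one can be tallied from true and predicted labels of a binary outcome. The function name `confusion_matrix_2x2` and the example labels are hypothetical, and how the PR obtains its predictions from the LASSO model is not specified here.

```python
import numpy as np


def confusion_matrix_2x2(y_true, y_pred, labels):
    """Count (true, predicted) label pairs for a binary outcome.

    Hypothetical helper for illustration.
    Rows index the true label, columns the predicted label,
    both in the order given by `labels`.
    """
    if len(labels) != 2:
        raise ValueError("expected exactly two labels")
    index = {label: i for i, label in enumerate(labels)}
    counts = np.zeros((2, 2), dtype=int)
    for truth, pred in zip(y_true, y_pred):
        counts[index[truth], index[pred]] += 1
    return counts


if __name__ == "__main__":
    # Made-up labels, not data from the PR.
    truth = ["healthy", "healthy", "disease", "disease", "disease"]
    predicted = ["healthy", "disease", "disease", "disease", "healthy"]
    print(confusion_matrix_2x2(truth, predicted, ["healthy", "disease"]))
```

The diagonal holds correctly classified samples, so a model with a strong global association concentrates its counts there.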

Side effects

This adds new example data sets to help test and document the new functionality.

TODO

  • [ ] update tests
  • [X] update docs

cdiener • Apr 09 '24 13:04