micom
[feature] Transition the phenotype associations to non-parametric tests
Purpose
The former strategy of identifying metabolite:phenotype associations from the coefficients of a LASSO model has proven quite unstable due to (a) the non-uniqueness of the coefficients and (b) instability across scikit-learn versions and initializations. This PR switches to a more stringent approach that uses Mann-Whitney U or Spearman rho tests to assess metabolite-level associations. A LASSO model is still fit to assess the overall/global association with the phenotype.
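To make the two statistics concrete, here is a minimal plain-Python sketch of what they compute. It is not the code in this PR: the function names and the toy flux values are hypothetical, and it assumes the Mann-Whitney U test applies to binary phenotypes and Spearman rho to continuous ones. In practice a library such as SciPy (`scipy.stats.mannwhitneyu`, `scipy.stats.spearmanr`) would also supply the p-values.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """U statistic of group x versus group y (rank-sum formulation)."""
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[: len(x)])
    return rank_sum_x - len(x) * (len(x) + 1) / 2


def spearman_rho(x, y):
    """Spearman rho: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(list(x)), average_ranks(list(y))
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical fluxes of one metabolite in two phenotype groups.
flux_cases = [0.9, 1.4, 1.1, 2.0]
flux_controls = [0.2, 0.5, 0.8, 0.3]
u_stat = mann_whitney_u(flux_cases, flux_controls)

# Hypothetical fluxes against a continuous phenotype.
flux = [1.0, 2.0, 3.0, 4.0, 5.0]
phenotype = [2.0, 1.0, 4.0, 3.0, 5.0]
rho = spearman_rho(flux, phenotype)
```

Because both statistics depend only on ranks, they do not vary with solver initialization or library version the way the LASSO coefficients did.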
Visualization
The visualization is similar but now shows a confusion matrix for binary outcomes. A quantitative effect measure will be used instead of coefficients.
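For readers unfamiliar with the term, a confusion matrix for a binary outcome simply counts how often each true label was predicted as each label. The sketch below is illustrative only: the helper name, labels, and predictions are hypothetical and do not come from the PR.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels, in `labels` order."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


# Hypothetical true phenotypes and model predictions.
y_true = ["healthy", "healthy", "disease", "disease", "disease"]
y_pred = ["healthy", "disease", "disease", "disease", "healthy"]
matrix = confusion_matrix(y_true, y_pred, labels=["healthy", "disease"])
```

The diagonal entries count the correctly classified samples, which is what makes the matrix a compact summary of the global model fit.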
Side effects
This adds new example data sets to help test and document the new functionality.
TODO
- [ ] update tests
- [X] update docs