Agglomeration methods

Open antagomir opened this issue 1 year ago • 0 comments

Consider feature / sample agglomeration methods as useful dimension reduction methods.

The first issue to consider is whether it is necessary to have separate functions for Features vs. Samples, or whether there could be just one merge function with a margin argument. In the latter case we could drop "Features" from the function names.
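As a rough, language-neutral illustration of the single-function idea (written in Python rather than R; the name `merge`, the `margin` values `"features"` and `"samples"`, and the list-of-lists count table are all hypothetical and not part of mia's API), one function can handle both cases by transposing when the sample margin is requested:

```python
def merge(counts, groups, margin="features"):
    """Sum the rows (features) or columns (samples) that share a group label.

    counts: list of rows; rows are features, columns are samples (assumed layout).
    groups: one label per row (margin="features") or per column (margin="samples").
    Returns (merged_counts, labels), with labels in order of first appearance.
    """
    if margin not in ("features", "samples"):
        raise ValueError("margin must be 'features' or 'samples'")
    if margin == "samples":
        # Transpose so samples become rows, merge as features, transpose back.
        transposed = [list(col) for col in zip(*counts)]
        merged, labels = merge(transposed, groups, margin="features")
        return [list(col) for col in zip(*merged)], labels
    if len(groups) != len(counts):
        raise ValueError("need exactly one group label per merged element")
    sums = {}  # dicts keep insertion order, so labels stay in first-seen order
    for row, label in zip(counts, groups):
        if label in sums:
            sums[label] = [a + b for a, b in zip(sums[label], row)]
        else:
            sums[label] = list(row)
    labels = list(sums)
    return [sums[label] for label in labels], labels
```

With this shape, the feature and sample variants differ only in the `margin` value, which is the argument for dropping "Features" from the names.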

These are closely related to:

  • mergeFeatures
  • mergeFeaturesByRank

Then we could add:

  • mergeFeaturesByCluster (based on cluster() function, based on co-abundances)
  • mergeFeaturesBySimilarity (similar to speedyseq::tip_glom; agglomerate tree leaves that have a small (user-defined) distance in the tree; also based on hierarchical clustering / co-abundance?; to be checked how this would differ from ByCluster and ByTree)
  • mergeFeaturesByTree (similar to speedyseq::tree_glom; agglomerate tree leaves that have a small (user-defined) distance in the tree; based on the tree only)
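
To make the distance-based variants above more concrete, here is a minimal sketch in Python (not R, and not how speedyseq or mia implement it). It assumes a precomputed symmetric matrix of pairwise leaf distances and uses single-linkage grouping, which is only one possible clustering rule; the function names and the `threshold` parameter are hypothetical.

```python
def group_leaves(dist, threshold):
    """Group leaves whose pairwise distance is <= threshold (single linkage).

    dist: symmetric matrix (list of lists) of leaf-to-leaf distances.
    Returns one label per leaf: the smallest leaf index in its group.
    """
    n = len(dist)
    parent = list(range(n))  # union-find forest, one node per leaf

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    # Keep the smaller index as root so labels are stable.
                    parent[max(ri, rj)] = min(ri, rj)
    return [find(i) for i in range(n)]


def agglomerate(counts, labels):
    """Sum the count rows (one per leaf) that share a group label."""
    sums = {}
    for row, label in zip(counts, labels):
        if label in sums:
            sums[label] = [a + b for a, b in zip(sums[label], row)]
        else:
            sums[label] = list(row)
    return [sums[label] for label in sums]
```

Note that single linkage chains: leaves A and C end up in the same group when both are close to B, even if A and C are farther apart than the threshold. Whether that is acceptable is part of what separates the candidate functions.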

See https://rdrr.io/github/mikemc/speedyseq/man/tree_glom.html

Naming could follow the one suggested in #392. Note that the TreeSummarizedExperiment package sometimes refers to the finest-level features as (tree) leaves. Consider the naming at the same time.

antagomir, Jul 24 '23 22:07