merge and cluster: separate or unified functions for rows / cols?

Open antagomir opened this issue 1 year ago • 1 comments

We have now:

Clustering in a single function, with the MARGIN argument:

  • cluster(... , MARGIN="features")
  • cluster(... , MARGIN="samples")

-> This function returns a `SummarizedExperiment` with clustering information in its colData or rowData.

Merge in two different functions:

  • mergeFeatures(...)
  • mergeSamples(...)

-> Each of these functions returns a merged (Tree)SE object.

It would seem logical to unify the treatment and implement merge() with the MARGIN argument as well.

One issue with this is that the rows have the additional sequence data slot that the columns do not have. Therefore the treatment of rows (features) requires an extra step, and the problem is not entirely symmetric.

A wrapper can certainly deal with this, but it highlights the more fundamental question of whether we want to enforce and maintain separate merge functions for samples and features. I do not see an immediate need for that: the merging procedure is near-identical, and both rows and cols can even carry the (optional) tree information.
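
To make the idea concrete, here is a rough sketch of what a single entry point with a MARGIN argument could look like. It works on a plain count matrix in base R rather than on a (Tree)SE object, and `merge_by_margin` and its arguments are hypothetical names for illustration only, not existing mia functions. The only asymmetry is the placeholder comment where the feature-specific extra step would go.

```r
# Hypothetical sketch: one merge function covering both margins.
# Operates on a plain matrix; `merge_by_margin` is NOT part of mia.
merge_by_margin <- function(x, f, MARGIN = c("features", "samples")) {
  MARGIN <- match.arg(MARGIN)
  stopifnot(is.matrix(x))
  if (MARGIN == "features") {
    stopifnot(length(f) == nrow(x))
    # Sum the rows that share a group label.
    merged <- rowsum(x, group = f)
    # The extra feature-only step (row-specific sequence data) would go here.
  } else {
    stopifnot(length(f) == ncol(x))
    # Same operation on columns: transpose, sum by group, transpose back.
    merged <- t(rowsum(t(x), group = f))
  }
  merged
}

# Toy data: 4 features x 3 samples.
counts <- matrix(1:12, nrow = 4,
                 dimnames = list(paste0("taxon", 1:4), paste0("sample", 1:3)))

by_feature <- merge_by_margin(counts, f = c("A", "A", "B", "B"), MARGIN = "features")
by_sample  <- merge_by_margin(counts, f = c("X", "X", "Y"), MARGIN = "samples")
```

Both branches share the same grouping logic, which is the argument for a unified function. In a real implementation the row-only handling would sit behind the "features" branch, so callers would see one interface.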

antagomir, Jul 25 '23 07:07