mia
merge and cluster: separate or unified functions for rows / cols?
We have now:
Clustering in a single function, with the MARGIN argument:
- cluster(... , MARGIN="features")
- cluster(... , MARGIN="samples")
-> This function returns a \code{SummarizedExperiment} with clustering information in its colData or rowData.
Merging in two different functions:
- mergeFeatures(...)
- mergeSamples(...)
-> These functions return a merged (Tree)SE object.
It would seem logical to unify the treatment and implement merge() with the MARGIN argument as well.
One issue with this is that the rows have an additional sequence data slot that the columns do not have. Therefore the treatment of rows (features) requires an extra step, and the problem is not entirely symmetric.
A wrapper can certainly deal with this, but it highlights the more fundamental question of whether we want to enforce and maintain separate merge functions for samples and features. I do not see an immediate need for that, as the merging procedure is near-identical; both rows and cols even have the (optional) tree information.
I agree with this. We could have a single merge(MARGIN = "features") function; inside it, MARGIN would determine whether to run the internal .merge_features or .merge_samples function.
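A rough sketch of what that dispatch could look like, for illustration only. It works on a plain numeric matrix in base R rather than on a real (Tree)SE object. The wrapper name merge_by_margin, the grouping argument f, and the bodies of .merge_features and .merge_samples are all assumptions made up for this example (the wrapper is not called merge() here only to avoid masking base::merge); none of this is the actual mia implementation.

```r
# Hypothetical internal helpers. A plain numeric matrix stands in for the
# assay of a (Tree)SummarizedExperiment; f is a grouping vector.
.merge_features <- function(x, f) {
    # Sum the rows that share the same value of f.
    merged <- rowsum(x, group = f)
    # The extra feature-only step (e.g. handling the sequence data slot)
    # would go here; it is left out of this sketch.
    merged
}

.merge_samples <- function(x, f) {
    # Sum the columns that share the same value of f:
    # transpose, merge rows, transpose back.
    t(rowsum(t(x), group = f))
}

# Hypothetical unified entry point with a MARGIN argument.
merge_by_margin <- function(x, f, MARGIN = c("features", "samples")) {
    MARGIN <- match.arg(MARGIN)
    switch(MARGIN,
        features = .merge_features(x, f),
        samples  = .merge_samples(x, f)
    )
}

# Toy data: 4 features (rows) by 3 samples (columns).
counts <- matrix(1:12, nrow = 4,
    dimnames = list(paste0("taxon", 1:4), paste0("sample", 1:3)))

by_feature <- merge_by_margin(counts, f = c("A", "A", "B", "B"),
    MARGIN = "features")
by_sample <- merge_by_margin(counts, f = c("g1", "g1", "g2"),
    MARGIN = "samples")
```

With this layout the asymmetry between rows and columns stays inside .merge_features, while the user-facing call is the same for both margins, matching how cluster() already works.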
This can also be closed?
Yep