Absolute vs. relative abundances
Some microbiome studies are starting to report measures of absolute abundance in addition to sequencing read counts and relative abundances. The MicrobiomeExperiment class could/should support incorporating this information, too.
Can you elaborate on this? Maybe an example you have seen?
For instance, Fig. 2 of Vandeputte et al. (2017) provides an early example of absolute vs. relative quantification.
Hmm, from that figure I guess it is just a factor from the colData which gets applied to scale the data, doesn't it? So before melting the data for plotting, a simple multiplication of the assay data with this factor from the colData would convert the relative abundance to an absolute abundance, which is then plotted.
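To make the multiplication described above concrete, here is a minimal base R sketch. It is not code from mia or MicrobiomeExperiment: a plain matrix stands in for the assay, a plain data.frame stands in for the colData, and the column name microbial_load and all values are made up for illustration.

```r
# Toy relative-abundance matrix: rows are taxa, columns are samples.
# All values are invented for illustration.
rel_abund <- matrix(
  c(0.5, 0.3, 0.2,   # sample1
    0.1, 0.6, 0.3),  # sample2
  nrow = 3,
  dimnames = list(c("taxonA", "taxonB", "taxonC"), c("sample1", "sample2"))
)

# Hypothetical per-sample factor, standing in for a colData column.
col_data <- data.frame(
  microbial_load = c(2e11, 5e10),
  row.names = c("sample1", "sample2")
)

# Look up the factor by sample name so the column order cannot get mixed up.
load_factor <- col_data[colnames(rel_abund), "microbial_load"]

# Multiply every column (sample) by its own factor.
abs_abund <- sweep(rel_abund, MARGIN = 2, STATS = load_factor, FUN = "*")

# Each column of the scaled matrix now sums to that sample's factor.
colSums(abs_abund)
```

The scaling has to happen on the wide matrix, before it is melted into long format, because the factor is defined per sample (per column).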
Can you repost the issue on mia, since this is about melting the data for plotting, and close this issue? Thanks
This probably should get combined with FelixErnst/MicrobiomeExperiment#18
I am not sure. Isn't including absolute abundance information a class-level definition issue rather than aggregation functionality? I was thinking of a dedicated slot that holds information on the absolute abundances, even if the data object itself has been transformed into e.g. relative abundances.
From my point of view, this is a bit like a library size factor. In the paper they described this as "downsized to an even sampling depth, defined as the ratio between sample size and microbial load (average total cell count per gram)".
So this ratio could be stored in colData, as is the case for sizeFactors (see ?sizeFactors and getMethod("sizeFactors","SingleCellExperiment")). Analogously, a getter and setter for this type of ratio is probably best. The data is then stored in colData (no extra slot) and can be retrieved easily.
For plotting, this factor has to be applied before melting, and thus this should get combined with FelixErnst/MicrobiomeExperiment#18.
I can create a PR for this in MicrobiomeExperiment. The use of the factor for plotting will probably end up in mia.
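As a rough illustration of the getter/setter idea, here is a base R sketch that keeps the per-sample ratio in a column of the sample table rather than in a dedicated slot. Everything in it is hypothetical: the accessor name abundance_factor, the column name abundanceFactor, and the toy list object are placeholders, not the API of MicrobiomeExperiment or of the PR mentioned in this thread.

```r
# Getter: read the per-sample factor from the sample table.
# (Hypothetical accessor and column names.)
abundance_factor <- function(x) {
  x$col_data[["abundanceFactor"]]
}

# Setter: a replacement function, so it can be used as
# abundance_factor(x) <- value. One value per sample is required.
`abundance_factor<-` <- function(x, value) {
  stopifnot(is.numeric(value), length(value) == nrow(x$col_data))
  x$col_data[["abundanceFactor"]] <- value
  x
}

# Toy object: a list with a data.frame standing in for colData.
toy <- list(col_data = data.frame(row.names = c("sample1", "sample2")))

abundance_factor(toy) <- c(2e11, 5e10)
abundance_factor(toy)
```

Because the value lives in an ordinary column, it travels with the sample table when samples are subset, and no change to the class definition is needed.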
Ok sounds good.
See FelixErnst/MicrobiomeExperiment#19 for the added getter/setter.
Just a really quick plug advocating against the idea of rescaling relative abundances (obtained from shotgun or amplicon) by absolute abundances. The reasons for the strong no are here (https://elifesciences.org/articles/46923) and an alternative is here (https://www.biorxiv.org/content/10.1101/761486v1).
Great, thanks @adw96 - my intention was mainly to prepare for future developments in measurement technologies (as this class may be potentially long-living), rather than use this for the currently available measurement techniques. Fully agree with the challenges.
@adw96 wouldn't the addition of this type of sample data allow the variable efficiency estimate to be computed based on the appropriate count matrix?
It would probably be quite straightforward to write an interface for paramedic, if my assumption is correct.
edit: OK, I see that the sample concentration paramedic needs is also two-dimensional. So that doesn't work out of the box with these changes.