mia

Absolute vs. relative abundances

Open antagomir opened this issue 4 years ago • 11 comments

Some microbiome studies are starting to report measures of absolute abundance in addition to sequencing read counts and relative abundances. The MicrobiomeExperiment could/should support incorporating this information, too.

antagomir, Oct 17 '20 21:10

Can you elaborate on this? Maybe an example you have seen?

FelixErnst, Oct 19 '20 11:10

For instance Vandeputte et al. (2017) Fig. 2 provides an early example of absolute vs. relative quantification.

antagomir, Oct 19 '20 12:10

Hmm, from that figure I guess it is just a factor from the colData that is applied to scale the data, isn't it?

So before melting the data for plotting, a simple multiplication of the assay data with this factor of the colData would convert the relative abundance to an absolute abundance, which is then plotted.
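
As a rough illustration of that multiplication, here is a minimal sketch in plain Python rather than mia or R code. The taxa, sample names, and values are all made up, and `to_absolute` is a hypothetical helper, not an existing function in any package:

```python
# Hypothetical relative abundances: outer keys are taxa, inner keys are samples.
relative_abundance = {
    "taxon_a": {"sample_1": 0.75, "sample_2": 0.25},
    "taxon_b": {"sample_1": 0.25, "sample_2": 0.75},
}

# Hypothetical per-sample factor, playing the role of one colData column
# (for example a total cell count per gram).
microbial_load = {"sample_1": 2.0e11, "sample_2": 4.0e10}


def to_absolute(rel, load):
    """Multiply each sample's relative abundances by that sample's factor."""
    return {
        taxon: {sample: value * load[sample] for sample, value in per_sample.items()}
        for taxon, per_sample in rel.items()
    }


absolute_abundance = to_absolute(relative_abundance, microbial_load)
```

The factor is indexed by sample only, so every taxon in a sample is scaled by the same number; the result is what would then be melted for plotting.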

Can you repost the issue on mia, since this is about melting the data for plotting, and close this issue? Thanks

FelixErnst, Oct 20 '20 07:10

This probably should get combined with FelixErnst/MicrobiomeExperiment#18

FelixErnst, Oct 20 '20 08:10

I am not sure. Isn't including absolute abundance information a class-level definition issue rather than aggregation functionality? I was thinking of a dedicated slot that holds information on the absolute abundances, even if the data object itself has been transformed into e.g. relative abundances.

antagomir, Oct 20 '20 08:10

From my point of view, this is a bit like a library size factor. In the paper they described this as "downsized to an even sampling depth, defined as the ratio between sample size and microbial load (average total cell count per gram)".

So this ratio could be stored in colData, as is the case for sizeFactors (see ?sizeFactors and getMethod("sizeFactors","SingleCellExperiment")). Analogously, a getter and setter for this type of ratio is probably best. The data is then stored in colData (no extra slot) and can be retrieved easily.
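
To make the getter/setter idea concrete, here is a small sketch in plain Python. It is not the MicrobiomeExperiment or SingleCellExperiment API: `SampleTable` is a toy stand-in for colData, and the column name `abundance_factor` and both function names are assumptions chosen only for illustration:

```python
FACTOR_COLUMN = "abundance_factor"  # hypothetical column name


class SampleTable:
    """Toy stand-in for colData: named columns, each keyed by sample name."""

    def __init__(self, sample_names):
        self.sample_names = list(sample_names)
        self.columns = {}


def set_abundance_factor(table, values):
    """Store one factor per sample as an ordinary column (no extra slot)."""
    values = list(values)
    if len(values) != len(table.sample_names):
        raise ValueError("need exactly one factor per sample")
    table.columns[FACTOR_COLUMN] = dict(zip(table.sample_names, values))


def get_abundance_factor(table):
    """Return the stored factors, or None if none have been set."""
    return table.columns.get(FACTOR_COLUMN)
```

Because the factor lives in an ordinary per-sample column, it travels with the sample data when the object is subset or transformed, which is the point of avoiding a dedicated slot.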

For plotting, this factor has to be applied before melting, and thus this should get combined with FelixErnst/MicrobiomeExperiment#18.

I can create a PR for this in MicrobiomeExperiment. The use of the factor for plotting will probably end up in mia.

FelixErnst, Oct 20 '20 09:10

Ok sounds good.

antagomir, Oct 20 '20 09:10

See FelixErnst/MicrobiomeExperiment#19 for the added getter/setter.

FelixErnst, Oct 20 '20 15:10

Just a really quick plug advocating against the idea of rescaling relative abundances (obtained from shotgun or amplicon) by absolute abundances. The reasons for the strong no are here (https://elifesciences.org/articles/46923) and an alternative is here (https://www.biorxiv.org/content/10.1101/761486v1).

adw96, Oct 20 '20 15:10

Great, thanks @adw96 - my intention was mainly to prepare for future developments in measurement technologies (as this class may potentially be long-lived), rather than to use this for the currently available measurement techniques. Fully agree with the challenges.

antagomir, Oct 20 '20 16:10

@adw96 wouldn't the addition of this type of sample data allow the variable efficiency estimate to be computed based on the appropriate count matrix?

It would probably be quite straightforward to write an interface for paramedic, if my assumption is correct.

Edit: OK, I see that the sample concentration paramedic needs is also two-dimensional. So that doesn't work out of the box with these changes.

FelixErnst, Oct 20 '20 16:10