
Question about implementing ALR with ERCC92 spike ins

Open jolespin opened this issue 3 years ago • 0 comments

I finally got my hands on a dataset with properly designed ERCC92 spike-ins. In theory, how should I use these with ALR?

The additive log-ratio transformation (alr), which allows the user to scale their data by a feature with an a priori known fixed abundance, such as a house-keeping gene or an experimentally fixed variable (e.g., a ThermoFisher ERCC synthetic RNA “spike-in” [15]), may provide a superior alternative. In contrast to clr, proportionality calculated with alr does not change with missing feature data because it effectively back-calculates the absolute feature abundance.

https://www.nature.com/articles/s41598-017-16520-0

Do I use a single ERCC92 feature as the reference, the summation, or the mean?

If it's either of the latter two options, do I include all of the ERCC92 features or only a select few?

Should I scale all the datasets so their ERCC92 spike counts are the same before transformation? (This will likely result in the same data, though I'm thinking out loud and haven't tested)
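
To make the three reference options and the scaling question concrete, here is a minimal NumPy sketch, not propr's implementation. The function name `alr_with_reference`, the toy counts, and the spike-in column indices are all hypothetical, and the geometric mean is used for the "mean" option as an assumption, since it is the mean of the logs. It only illustrates that multiplying each sample by its own constant cancels out of a log-ratio whichever reference is used; it says nothing about which reference is the better choice.

```python
import numpy as np

def alr_with_reference(counts, ref_idx, how="geomean"):
    """Log-ratio of every non-reference feature to a spike-in reference.

    counts  : samples x features array of strictly positive counts
    ref_idx : column indices of the spike-in features (hypothetical layout)
    how     : "single" (first reference column), "sum", or "geomean"
    """
    x = np.asarray(counts, dtype=float)
    ref_idx = np.atleast_1d(ref_idx)
    ref = x[:, ref_idx]
    if how == "single":
        log_ref = np.log(ref[:, 0])
    elif how == "sum":
        log_ref = np.log(ref.sum(axis=1))
    elif how == "geomean":
        log_ref = np.log(ref).mean(axis=1)
    else:
        raise ValueError(f"unknown reference type: {how}")
    # Drop the reference columns from the output.
    keep = np.setdiff1d(np.arange(x.shape[1]), ref_idx)
    return np.log(x[:, keep]) - log_ref[:, None]

# Toy data: 3 samples, 5 features; the last two columns play the role of spike-ins.
counts = np.array([
    [10.0, 200.0, 35.0, 50.0, 80.0],
    [40.0, 150.0, 12.0, 25.0, 45.0],
    [ 7.0, 900.0, 60.0, 90.0, 160.0],
])
spike_cols = [3, 4]

# Rescale each sample so its total spike-in count equals a common target.
target = 100.0
factor = target / counts[:, spike_cols].sum(axis=1)
scaled = counts * factor[:, None]

# Compare the transform before and after rescaling, for each reference choice.
unchanged = {
    how: bool(np.allclose(alr_with_reference(counts, spike_cols, how),
                          alr_with_reference(scaled, spike_cols, how)))
    for how in ("single", "sum", "geomean")
}
print(unchanged)
```

If that holds on real data too, pre-scaling the samples to equal spike-in counts would be redundant, and the remaining question is only which reference to divide by.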

jolespin, Jan 17 '22 16:01