likely error in CLR formula in decostand docs

Open handibles opened this issue 4 months ago • 5 comments

Dear Devs,

Thanks as always for the work. I was using decostand as a quick look-up for the CLR formula, but I think it's wrong in the vegan docs.

While the code for CLR (.calc_clr) is fine and uses the mean of logs:

means <- rowMeans(clog)

The docs indicate that the formula here is $log(x) - log(u)$, where $x$ are the beasties and $u$ is the beastie mean, i.e. the log of the mean rather than the mean of the logs.

Sadly,

> mean(log(1:10)) == log(mean(1:10))
[1] FALSE

Might save the next veganner from a mishap. All the best, CH

handibles, Feb 21 '24 13:02

@antagomir Have you looked at this?

jarioksa, Apr 24 '24 09:04

Whoops, yes. This we should be able to solve asap. I will have a look today/tomorrow and open a PR. The broader rCLR improvements might take more time.

antagomir, Apr 24 '24 10:04

I think the issue confused arithmetic and geometric mean, and the documentation is correct.

The documentation states that

clr = log(x/g(x)) = log x - log (g(x)), where g(x) denotes the geometric mean.

This is how CLR transformation is formally defined.

Log of geometric mean can be written as: log (g(x)) = log ((x1 * ... * xn)^(1/n)) = (1/n) * (log(x1) + ... + log(xn)) = mean(log(x))

Thus, log(g(x)) = mean(log(x)), where g(x) is geometric mean.

This is also seen with:

gm_mean = function(a){prod(a)^(1/length(a))}
log(gm_mean(1:10))
[1] 1.510441

mean(log(1:10))
[1] 1.510441
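
The same identity can also be checked row-wise on a small matrix, which is closer to how a CLR is applied to a community table. This is only a minimal sketch in base R: `clr_manual` and the toy matrix are hypothetical names and data made up for illustration, not vegan's internal code.

```r
# Hypothetical helper (not part of vegan): CLR per row,
# computed as log(x) minus the row mean of the logs.
clr_manual <- function(x) {
  clog <- log(x)
  clog - rowMeans(clog)
}

# Toy strictly positive matrix (2 samples x 3 taxa), made up for illustration.
x <- matrix(c(1, 2, 4,
              3, 3, 9), nrow = 2, byrow = TRUE)

# Row-wise geometric mean, written out explicitly.
g <- apply(x, 1, function(r) prod(r)^(1 / length(r)))

# log(x / g(x)) should agree with log(x) - mean(log(x)) up to floating point.
all.equal(clr_manual(x), log(x / g))

# Each CLR-transformed row should sum to (numerically) zero.
rowSums(clr_manual(x))
```

The sketch assumes strictly positive entries; zeros would need a pseudocount or similar handling before taking logs.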

I am not sure whether the documentation could/should be improved, since it already states that g(x) is the geometric mean, and this is how the CLR definition is written in most sources afaik.

As far as I can see the documentation is correct and this issue could be closed unless there are further suggestions on how to improve.

antagomir, Apr 24 '24 16:04

Thanks @antagomir. I've yet to find a solid mnemonic for the order in the CLR transform (hence the OP), but if I've got this right:

Definition of the CLR (as above):

clr = log(x/g(x)) = log x - log(g(x))

i.e., clr = log(x/g(x)) = log x - mean(log(x))

Definition in the decostand documentation (latex'd formula): $clr = log(x) - log(u)$, where u is the arithmetic mean:

Is that not incorrect?...
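
To make the difference concrete, here is a small base R sketch comparing the two readings of the formula. The vector `x` and the variable names are made up for illustration and are not taken from the vegan sources.

```r
# Made-up abundances for a single sample.
x <- c(1, 10, 100)

# Reading 1: subtract the log of the geometric mean, i.e. the mean of the logs.
clr_geometric <- log(x) - mean(log(x))

# Reading 2: subtract the log of the arithmetic mean u.
clr_arithmetic <- log(x) - log(mean(x))

# With the geometric mean the transformed values are centred on zero.
sum(clr_geometric)

# With the arithmetic mean they are not: log(mean(x)) >= mean(log(x)),
# so the sum is negative whenever the values in x are not all equal.
sum(clr_arithmetic)
```

Only the first reading gives values that sum to zero, which is the "centred" part of the centred log-ratio.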

handibles, Apr 29 '24 12:04

Thanks @handibles - yes - that seems incorrect.

Looking at the current master branch of the man pages in github:vegandevs/vegan, lines 94-109 (clr documentation) refer to the geometric mean. So at least that manpage seems to be correct.

Where did you find that incorrect formula exactly - could you point me to the exact source? I will correct asap if we have a mistake anywhere. But right now I am not able to trace this..!

antagomir, May 01 '24 19:05