
Are species scores appropriate to calculate when using RPCA with vegan's dbRDA + sppscores?

Open mestaki opened this issue 1 year ago • 0 comments

Hey @cameronmartino,

In the past I've used QIIME 2 to create the species/sites biplot from gemelli's phylo-/RPCA. This biplot is of course unconstrained by environmental variables, which leaves much to be desired, so I've been playing with importing the results into R to work with RDA.

The choices are either to import the rclr-transformed table (you can actually do this transform directly in vegan now) and use vegan::rda(), or to import the distance matrix and use dbrda. I'm inclined toward the latter approach, as I'm guessing that the additional matrix completion layer gemelli adds improves the signal. Is this first assumption right?
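
For reference, this is the transform I mean by rclr, as a minimal Python sketch. Assumptions: zeros are treated as unobserved and left as None, the function name `rclr` and the example counts are made up for illustration, and this is not gemelli's or vegan's actual implementation.

```python
import math

def rclr(counts):
    """Robust centered log-ratio of one sample (a list of non-negative counts).

    Zeros are treated as unobserved and returned as None. This is an
    illustrative convention, not necessarily what gemelli or vegan return.
    """
    # Take logs of the observed (nonzero) entries only.
    logs = [math.log(c) for c in counts if c > 0]
    if not logs:
        raise ValueError("sample has no nonzero counts")
    # Mean of the logs = log of the geometric mean of the observed counts.
    center = sum(logs) / len(logs)
    return [math.log(c) - center if c > 0 else None for c in counts]

# Hypothetical sample with one unobserved feature.
sample = [10, 0, 100, 1000]
transformed = rclr(sample)
```

The observed entries of `transformed` are centered on zero, and the zero count stays missing, which is the gap the matrix completion step is meant to handle.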

If so, the second challenge is that, unlike the capscale function, dbrda doesn't calculate species scores; you have to add them afterwards using sppscores. There was a nice discussion here about why this is the case and about the limitations of calculating these species scores. One line worth highlighting with regard to the scores:

It is strictly correct only with Euclidean distances and can be misleading with other distances (even metric ones)

From what I gather, the phylo-/RPCA distances would be considered Euclidean and so OK to calculate species scores with. Is my speculation correct here, meaning we can technically get an RPCA triplot? Or is there something I'm missing?
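
One way to check the Euclidean assumption empirically rather than guess: a distance matrix is Euclidean exactly when its Gower double-centered matrix has no negative eigenvalues. Below is a minimal Python/NumPy sketch of that check. The function `is_euclidean`, the tolerance, and both toy matrices are hypothetical illustrations, not anything taken from gemelli or vegan.

```python
import numpy as np

def is_euclidean(dist, tol=1e-8):
    """Return True if a symmetric distance matrix can be embedded in Euclidean space.

    Classical (Gower) criterion: B = -0.5 * J @ D**2 @ J must have no
    negative eigenvalues, up to a relative numerical tolerance.
    """
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    b = -0.5 * j @ (d ** 2) @ j                # double-centered squared distances
    eigvals = np.linalg.eigvalsh((b + b.T) / 2)  # symmetrize against round-off
    scale = max(1.0, float(np.abs(eigvals).max()))
    return bool(eigvals.min() >= -tol * scale)

# Toy case 1: distances computed from made-up coordinates are Euclidean by construction.
rng = np.random.default_rng(0)
coords = rng.normal(size=(6, 3))
euclid = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)

# Toy case 2: a metric that is not Euclidean. Three points form an equilateral
# triangle of side 2; a fourth point is 1.1 from each, which satisfies the
# triangle inequality but is closer than the circumradius (about 1.155).
metric_not_euclid = np.array([
    [0.0, 2.0, 2.0, 1.1],
    [2.0, 0.0, 2.0, 1.1],
    [2.0, 2.0, 0.0, 1.1],
    [1.1, 1.1, 1.1, 0.0],
])

result_euclid = is_euclidean(euclid)
result_metric = is_euclidean(metric_not_euclid)
```

The same check could be run on the distance matrix exported from gemelli before handing it to dbrda. The second toy case shows why "metric" alone is not enough, which is the point of the quoted caveat.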

Thanks!

mestaki, Jul 12 '22 08:07