propr
propr copied to clipboard
Error: posthoc reports finding negative measurements where there is none
Heya, Thom
I've been using your package for a while and recently tried the newly implemented posthoc function, but it threw me an error:
Error in if (any(counts < 0)) stop("Data may not contain negative measurements.") : missing value where TRUE/FALSE is needed
The raw data table used in the analysis is too large to post here, but I believe the procedure is straightforward. The data table is an OTU table of counts, with no negative, infinite or NA values, and only numeric columns. The following snippet describes my pipeline: otus.cmR <- zCompositions::cmultRepl(otus) # stores counts as proportions otus.propd <- propd(otus.cmR, group = map$diag_zone) # diag zone is a vector of character strings indicating the diagenetic zone of each sample, with six different states. otus.propd <- setActive(otus.propd) otus.propd <- updateCutoffs.propd(otus.propd, cutoff = c(0.05, 0.95, 0.3)) otus.propd <- posthoc(otus.propd) # This is where the error arises.
I checked the output of the cmultRepl function, and there are neither negative nor invalid values; I guess the successful execution of the code lines above the post hoc commant all testify to this.
Best, Tor Einar
Thanks Tor. I have a few questions...
-
Could you run:
updateF(otus.propd)
and let me know if this reproduces the error? -
Is map$diag_zone a factor or a character? (I don't know why one would work vs. the other, but make sure it is a character so I can rule out that as the cause)
-
Could you confirm that the following snippet works?
data(iris)
x <- iris[,1:4]
y <- iris[,5]
library(propr)
otus.propd <- propd(x, group = y)
otus.propd <- setActive(otus.propd)
otus.propd <- updateCutoffs.propd(otus.propd, cutoff = c(0.05, 0.95, 0.3))
otus.propd <- posthoc(otus.propd)
If not, update with devtools::install_github("tpq/propr")
-
updateF(otus.propd)
works. -
class(map$diag_zone)
gives "character". - The snippet works.
One thing I forgot to add in the initial post, which I believe is insignificant, is that the propd object that's created is stored as a list element as follows:
cmp$propd <- propd(cmp$otus, group = cmp$map$diag_zone)
I'm sorry if leaving this information out is what has caused the trouble.
Edit: I updated the package with devtools::install_github("tpq/propr")
. Previously, the function message on ANOVA being run appeared before the error, but this is not the case anymore. The error persists.
I admit I'm a bit stumped without de-bugging the data first hand. Do you feel comfortable sharing the OTU and labels (as separate CSVs) via private message?
I did another check of the mapping file that went into propd. I see that it contains some NA
s. I don't know if that could help by pointing towards possible NA
occurrences in the resulting propd
object, for instance if samples labelled with NA
are somehow treated as a group or if the mapping vector is coerced into character for the analysis but saved in its initial state beforehand, and this initial state is then re-used in posthoc
? Just thinking out loud. Nevertheless, it seems weird to me that this would pass through all the other functions without problem.
I'll send you the current data via private message.
I did an additional check for my data and saw that all theta
values were 1. Weird. I tried rerunning the initial snippet, except for the updateCutoffs.propd
function, with the updated mapping vector, now without NA
values, and it's currently not throwing me any errors where it used to. I'll update this thread again when posthoc
has finished running - i don't know how long that's expected to take for my data.
In summary, it seems the mistake is on me for feeding propd NA
s in the group
vector. I'd suggest letting the propd
function throw a warning or error in the future if this turns out to be is the case, though.
Update: It worked without error.
posthoc() does do something different than all other functions... it tries to parse the group vector into a set of pairwise contrasts. Something about the NAs must cause this to fail loudly, while the NAs might cause the other methods to fail silently? I'll be sure to add an explicit check in the next version!!
(By the way, when working with propd
, you can use updateF
to get exact p-values. This is much, much faster than the updateCutoffs
approach.)