decontam prevalence method figure; Prevalence (Negative Controls) on x-axis against Prevalence (True Samples) on y-axis?

Open marwa38 opened this issue 2 years ago • 1 comments

Hello. I went with the prevalence method, but the plot I get from the commands below looks different from the one in the tutorial (mine has just a few points, as you can see in the figures attached below). Could you please guide me on this?

contamdf.prev05 <- isContaminant(ps.decon, method="prevalence", neg="is.neg", threshold=0.5)
table(contamdf.prev05$contaminant)
# FALSE  TRUE 
# 2029    16

# Make phyloseq object of presence-absence in negative controls and true samples
ps.pa <- transform_sample_counts(ps.decon, function(abund) 1*(abund>0))
ps.pa.neg <- prune_samples(sample_data(ps.pa)$phase == "negative", ps.pa)
ps.pa.pos <- prune_samples(sample_data(ps.pa)$phase == "positive", ps.pa)
# Make data.frame of prevalence in positive and negative samples
df.pa <- data.frame(pa.pos=taxa_sums(ps.pa.pos), pa.neg=taxa_sums(ps.pa.neg),
                      contaminant=contamdf.prev05$contaminant)

ggplot(data=df.pa, aes(x=pa.neg, y=pa.pos, color=contaminant)) + geom_point() +
  xlab("Prevalence (Negative Controls)") + ylab("Prevalence (True Samples)")

Attachment: df.pa.zip

Session info: decontam_1.14.0

[Figure: prevalence plot in the tutorial]

[Figure: my prevalence plot]

many thanks

Feb 27 '22 10:02 marwa38

I'm not sure what your question is?

Mar 01 '22 23:03 benjjneb

Oops, sorry, I missed your answer. My figure didn't show a pattern similar to the one in the decontam tutorial (the two figures attached above in the post). I am not sure I understand what the figure is meant to show. What do you think? Could you please comment? Many thanks @benjjneb

Nov 03 '22 11:11 marwa38

Your figure shows such a small number of samples (3 negative controls, 2 real samples) that I don't think decontam is even a useful tool here. You'll need to develop some sort of ad hoc approach to removing contaminants (e.g. removing everything that appears in >2 negative controls), or return to this when you have your full dataset.
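The ad hoc rule mentioned above (dropping taxa that appear in more than 2 negative controls) could be sketched roughly as follows. This is only an illustration in base R on a made-up count matrix; the sample names, taxon names, and counts are hypothetical and not taken from the dataset in this thread.

```r
# Toy count matrix (hypothetical data): rows are samples, columns are taxa.
counts <- matrix(
  c(5, 0, 3, 0,
    2, 0, 1, 0,
    7, 1, 0, 0,
    0, 9, 4, 2,
    0, 8, 0, 6),
  nrow = 5, byrow = TRUE,
  dimnames = list(c("neg1", "neg2", "neg3", "s1", "s2"),
                  c("taxA", "taxB", "taxC", "taxD"))
)

# TRUE for negative controls, FALSE for everything else (assumed labelling).
is.neg <- c(TRUE, TRUE, TRUE, FALSE, FALSE)

# Prevalence = number of negative controls in which each taxon is present.
prev.neg <- colSums(counts[is.neg, , drop = FALSE] > 0)

# Flag taxa present in more than 2 negative controls, then drop them.
flagged <- names(prev.neg)[prev.neg > 2]
filtered <- counts[, !(colnames(counts) %in% flagged), drop = FALSE]
```

With a phyloseq object, the same kind of keep/drop vector could presumably be passed to prune_taxa() instead of indexing a matrix. The cutoff of 2 is taken from the reply above and would need adjusting to the number of negative controls actually available.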

Nov 03 '22 14:11 benjjneb

Thanks for your reply @benjjneb. What do you mean by real samples? My actual samples, or samples that are not contaminated? Here is the sample number.

Nov 03 '22 14:11 marwa38

Your previous figure shows a maximum "Prevalence (True Samples)" of 2, and a maximum "Prevalence (Negative Controls)" of 3. So, either that figure is plotted incorrectly, or there is nearly no overlap between the taxa found in various samples.

Nov 03 '22 14:11 benjjneb

I double-checked, and yes, for ps.pa.pos I had chosen the positive controls and not the true samples; that is why. Do you think it is a good practice to include the positives as true samples, or better not to consider them? In the end, though, I got the same result: only three taxa were removed.

Nov 03 '22 19:11 marwa38

From that plot, it looks like almost none of your taxa are also found in your negative controls. So, that seems good.

do you think it is a good practice to include the positives as true samples or better not to consider?

I think it is fine to have the positive samples included with the true samples for the purpose of "prevalence" testing.
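One way to avoid the mix-up discussed above is to split the presence-absence table by a single logical is.neg column, so that everything that is not a negative control (positive controls included) is counted on the "true samples" axis. The sketch below uses base R and made-up data; the phase value "sample" and all counts are assumptions for illustration only.

```r
# Hypothetical sample metadata; "negative" and "positive" mirror the phase
# values used in the post above, "sample" is an assumed label for true samples.
meta <- data.frame(
  phase = c("negative", "negative", "positive", "sample", "sample", "sample"),
  row.names = c("neg1", "neg2", "pos1", "s1", "s2", "s3")
)

# Only negative controls are TRUE; positive controls are grouped with true samples.
meta$is.neg <- meta$phase == "negative"

# Toy count matrix (hypothetical data): rows are samples, columns are taxa.
counts <- matrix(
  c(4, 0, 0,
    1, 0, 2,
    0, 5, 0,
    0, 3, 1,
    2, 6, 0,
    0, 1, 0),
  nrow = 6, byrow = TRUE,
  dimnames = list(rownames(meta), c("taxA", "taxB", "taxC"))
)

# Presence-absence table, then per-taxon prevalence in each group.
pa <- 1 * (counts > 0)
df.pa <- data.frame(
  pa.pos = colSums(pa[!meta$is.neg, , drop = FALSE]),
  pa.neg = colSums(pa[meta$is.neg, , drop = FALSE])
)

# Sanity check: prevalence can never exceed the number of samples in a group.
stopifnot(max(df.pa$pa.pos) <= sum(!meta$is.neg),
          max(df.pa$pa.neg) <= sum(meta$is.neg))
```

The final check is the same reasoning used earlier in the thread: a maximum "Prevalence (True Samples)" of 2 implies that only a couple of samples ended up in that group, which points to the subsetting step rather than to decontam itself.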

Nov 03 '22 23:11 benjjneb
