How to remove a specific ASV/OTU from specific samples within a Phyloseq object

Open mes1024 opened this issue 5 years ago • 2 comments

Hello,

I'm trying to find a way to remove ASVs/OTUs from a subset of samples within a phyloseq object. I have reason to believe that a certain ASV from one sample type has contaminated other samples that were isolated from the same environment, meaning it's a real ASV in some samples and a contaminant in others.

I couldn't find another post that addresses this specific issue, but I did find one detailing how to remove an ASV from all samples within a phyloseq object (issue #652) using the function below:

pop_taxa = function(physeq, badTaxa){
    allTaxa = taxa_names(physeq)
    allTaxa <- allTaxa[!(allTaxa %in% badTaxa)]
    return(prune_taxa(allTaxa, physeq))
}

badTaxa = c("bad1", "bad2", "bad3")
GP2 = pop_taxa(GlobalPatterns, badTaxa)

I'm curious whether this function can be altered to target certain samples based on metadata or sample names. For example, if the ASV truly belongs to samples that are all "species1" and is a contaminant in samples that are all "species2", can I remove that ASV from only the "species2" samples?

Thanks in advance for any recommendations on this issue! -Michael

mes1024, Mar 07 '19 19:03

An example of how you might do this.

library(phyloseq)
library(dplyr) # Just used for the pipe (%>%)

data(GlobalPatterns)
ps <- GlobalPatterns
otu <- otu_table(ps)
sam <- sample_data(ps)

# Just for sake of example, I'm taking the bad taxa to be the top 10 taxa with
# the most reads in the Fecal samples
bad_taxa <- ps %>%
    subset_samples(SampleType == "Feces") %>%
    taxa_sums %>%
    sort(decreasing = TRUE) %>%
    .[1:10] %>%
    names

# Suppose we want to remove some taxa from the ocean samples.
ocean_samples <- sample_names(ps)[sam$SampleType == "Ocean"]
# Now set the counts of bad_taxa in ocean_samples to 0. The indexing below
# assumes taxa are rows, so check first; if this returns FALSE, index as
# otu[ocean_samples, bad_taxa] instead.
taxa_are_rows(ps)
#> [1] TRUE
otu[bad_taxa, ocean_samples] <- 0
otu_table(ps) <- otu
# Now they're 0
otu_table(ps)[bad_taxa, ocean_samples]
#> OTU Table:          [10 taxa and 3 samples]
#>                      taxa are rows
#>        NP2 NP3 NP5
#> 331820   0   0   0
#> 158660   0   0   0
#> 189047   0   0   0
#> 244304   0   0   0
#> 171551   0   0   0
#> 263681   0   0   0
#> 192573   0   0   0
#> 322235   0   0   0
#> 180658   0   0   0
#> 326977   0   0   0

mikemc, Mar 08 '19 22:03

I only used the pipe %>% to quickly pick the example bad taxa. You could equally do

fecal_samples <- subset_samples(ps, SampleType == "Feces")
fecal_taxa_sums <- taxa_sums(fecal_samples)
fecal_taxa_sums <- sort(fecal_taxa_sums, decreasing = TRUE)
bad_taxa <- names(fecal_taxa_sums[1:10])
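
To tie this back to the original question, the steps above could be wrapped in a helper similar in spirit to pop_taxa. The following is only a sketch: zero_taxa_in_samples is a hypothetical name, not a phyloseq function, and the first three taxa of GlobalPatterns stand in as placeholder "bad" taxa. It works on a plain matrix copy of the OTU table and handles both orientations of the table.

```r
library(phyloseq)

# Hypothetical helper (not part of phyloseq): set the counts of `bad_taxa`
# to zero in `samples` only, leaving every other sample untouched.
zero_taxa_in_samples <- function(physeq, bad_taxa, samples) {
    stopifnot(all(bad_taxa %in% taxa_names(physeq)),
              all(samples %in% sample_names(physeq)))
    # Work on a plain matrix copy of the OTU table
    mat <- as(otu_table(physeq), "matrix")
    if (taxa_are_rows(physeq)) {
        mat[bad_taxa, samples] <- 0
    } else {
        mat[samples, bad_taxa] <- 0
    }
    # Put the modified counts back, keeping the original orientation
    otu_table(physeq) <- otu_table(mat, taxa_are_rows = taxa_are_rows(physeq))
    physeq
}

data(GlobalPatterns)
ps <- GlobalPatterns

# Select the target samples from the metadata
ocean_samples <- sample_names(ps)[sample_data(ps)$SampleType == "Ocean"]
# Placeholder taxa, chosen only for illustration
bad_taxa <- taxa_names(ps)[1:3]

ps2 <- zero_taxa_in_samples(ps, bad_taxa, ocean_samples)
```

Note that this zeroes counts rather than removing taxa, so the taxa stay in the object. If a taxon ends up with no reads in any sample, something like prune_taxa(taxa_sums(ps2) > 0, ps2) should drop it afterwards.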

mikemc, Mar 08 '19 22:03