AstrobioMike.github.io

decontam bug in subsetting fasta file when there are no contaminants

Open · MADscientist314 opened this issue 2 years ago • 1 comment

Hey @AstrobioMike, hope you are well. I ran across a situation in your full amplicon example that breaks the tutorial. Not sure if it's worth mentioning, but if decontam doesn't identify any contaminant sequences from the negative controls, subsetting the fasta file breaks. Here is an example of what I'm talking about. Sorry I don't have an elegant solution.

> vector_for_decontam <- c(rep(FALSE, 33), rep(TRUE, 3))
> tail(rownames(t(asv_tab)))
[1] "NORMAL-11b" "GNOTO-12b" "GNOTO-13b"  "KITNEG-KN1"
[5] "KITNEG-KN2" "KITNEG-KN3"
> contam_df <- isContaminant(t(asv_tab), neg=vector_for_decontam)
> table(contam_df$contaminant) # identified no contaminants
FALSE 
  585 
> unique(contam_df$contaminant)
[1] FALSE
> # getting vector holding the identified contaminant IDs
> contam_asvs <- row.names(contam_df[contam_df$contaminant == TRUE, ])
> contam_asvs
character(0)
> asv_tax[row.names(asv_tax) %in% contam_asvs, ]
     domain phylum class order family genus species
> # making new fasta file
> contam_indices <- which(asv_fasta %in% paste0(">", contam_asvs))
> contam_indices
integer(0)
> dont_want <- sort(c(contam_indices, contam_indices + 1))
> print(dont_want)
numeric(0)
> asv_fasta_no_contam <- asv_fasta[-dont_want]
> asv_fasta_no_contam
character(0) #UH OHHHH
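
The empty result follows from how R handles negative indexing: negating a zero-length vector gives another zero-length vector, and indexing by a zero-length vector selects nothing rather than dropping nothing. A minimal sketch of just that behavior, using a made-up two-sequence stand-in for `asv_fasta` (the `toy_fasta` contents are hypothetical, not from the tutorial data):

```r
# toy stand-in for asv_fasta (hypothetical contents, header/sequence pairs)
toy_fasta <- c(">ASV_1", "ACGT", ">ASV_2", "TTGA")

# what dont_want looks like when no contaminants were identified
dont_want <- numeric(0)

# negating an empty vector leaves it empty
identical(-dont_want, dont_want)   # TRUE

# so the "drop these" index is really an empty selection
length(toy_fasta[-dont_want])      # 0
```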

MADscientist314, Apr 05 '22 15:04

Ah of course! I’ll add in a step to check and notes about skipping the sub-setting stuff to be more clear. Thanks for the note, Michael! Hope all is well in your world too :)

AstrobioMike, Apr 05 '22 17:04
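
One way the check mentioned in the reply could look, offered as a sketch rather than the tutorial's actual fix: subset only when at least one contaminant header was found, and otherwise return the fasta vector unchanged. The helper name `drop_contam_from_fasta` and the `toy_fasta` data are hypothetical; the sketch assumes, as in the transcript above, that `asv_fasta` alternates header and sequence lines.

```r
# hypothetical helper: remove contaminant headers and their sequence lines
drop_contam_from_fasta <- function(asv_fasta, contam_asvs) {
  # positions of the header lines belonging to contaminant ASVs
  contam_indices <- which(asv_fasta %in% paste0(">", contam_asvs))

  # nothing flagged: return the input as-is instead of indexing with -numeric(0)
  if (length(contam_indices) == 0) {
    return(asv_fasta)
  }

  # each header is followed by its sequence line, so drop both
  dont_want <- sort(c(contam_indices, contam_indices + 1))
  asv_fasta[-dont_want]
}

# toy data (hypothetical), checking both the empty and non-empty cases
toy_fasta  <- c(">ASV_1", "ACGT", ">ASV_2", "TTGA")
no_contam  <- drop_contam_from_fasta(toy_fasta, character(0))
one_contam <- drop_contam_from_fasta(toy_fasta, "ASV_1")
```

An alternative that avoids the branch is to index positively, for example `asv_fasta[setdiff(seq_along(asv_fasta), dont_want)]`, which keeps everything when `dont_want` is empty.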