TrimGalore icon indicating copy to clipboard operation
TrimGalore copied to clipboard

trimgalore and cutadapt in two steps?

Open marwa38 opened this issue 1 year ago • 6 comments

Hi team I know that trimgalore is a wrapper around cutadapt but I want to check what I ran is considered fine I ran trimgalore to remove adaptors and primers at 5' and 3' ends but then

trim_galore -q 30 --phred33 --paired --gzip --fastqc --fastqc_args "--nogroup" -o ./trimGalored --length 20 --stringency 1 -e 0 --clip_R1 17 --clip_R2 21 *_S"$i"_L001_R1_001.fastq.gz *_S"$i"_L001_R2_001.fastq.gz

I found that I need to check for primers within the sequences themselves and I found and I removed them Things are ok that I did my analysis in that way? I don't want to reran the analysis again but I want to make sure that this is fine.

# Place filtered files in filtN/ subdirectory
fnFs.filtN <- file.path(path, "filtN", basename(fnFs)) # Put N-filterd files in filtN/ subdirectory
fnRs.filtN <- file.path(path, "filtN", basename(fnRs))
filterAndTrim(fnFs, fnFs.filtN, fnRs, fnRs.filtN, maxN = 0, multithread = TRUE)

#Check if primers are still inside (N, W and H nucleotides are identified by cutadapt)
FWD <- "CCTACGGGNGGCWGCAG"  
REV <- "GACTACHVGGGTATCTAATCC"

allOrients <- function(primer) {
  # Create all orientations of the input sequence
  require(Biostrings)
  dna <- DNAString(primer)  # The Biostrings works w/ DNAString objects rather than character vectors
  orients <- c(Forward = dna, Complement = complement(dna), Reverse = reverse(dna), 
               RevComp = reverseComplement(dna))
  return(sapply(orients, toString))  # Convert back to character vector
}
FWD.orients <- allOrients(FWD)
REV.orients <- allOrients(REV)
FWD.orients

# now ready to count the number of times the primers appear in the forward and reverse read, while considering all possible primer orientations. 
#Identifying and counting the primers on one set of paired end FASTQ files is sufficient, assuming all the files were created using the same library preparation, so we’ll just process the first sample.

primerHits <- function(primer, fn) {
  # Counts number of reads in which the primer is found
  nhits <- vcountPattern(primer, sread(readFastq(fn)), fixed = FALSE)
  return(sum(nhits > 0))
}
library (ShortRead)
rbind(FWD.ForwardReads = sapply(FWD.orients, primerHits, fn = fnFs.filtN[[1]]), 
      FWD.ReverseReads = sapply(FWD.orients, primerHits, fn = fnRs.filtN[[1]]), 
      REV.ForwardReads = sapply(REV.orients, primerHits, fn = fnFs.filtN[[1]]), 
      REV.ReverseReads = sapply(REV.orients, primerHits, fn = fnRs.filtN[[1]]))
#                   Forward Complement Reverse RevComp
# FWD.ForwardReads       0          0       0       0
# FWD.ReverseReads       0          0       0     112
# REV.ForwardReads       0          0       0       8
# REV.ReverseReads      99          0       0       0

marwa38 avatar Nov 21 '22 12:11 marwa38