ComputationalGenomicsManual
Filtering host reads
Hello, I was trying the following commands, as you described, to filter out host sequences:
host sequences:
mkdir host not_host
samtools fastq -F 3588 -f 65 output.bam | gzip -c > host/output_S_R1.fastq.gz
echo "R2 matching host genome:"
samtools fastq -F 3588 -f 129 output.bam | gzip -c > host/output_S_R2.fastq.gz

sequences that are not host:
samtools fastq -F 3584 -f 77 output.bam | gzip -c > not_host/output_S_R1.fastq.gz
samtools fastq -F 3584 -f 141 output.bam | gzip -c > not_host/output_S_R2.fastq.gz
samtools fastq -f 4 -F 1 output.bam | gzip -c > not_host/output_S_Singletons.fastq.gz
I am new to samtools and do not understand the -F and -f options or the integers that follow them. Do these determine which sequences are treated as host and non-host?
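From what I can tell, each integer is a sum of the FLAG bits defined in the SAM specification: -f keeps reads that have all of the given bits set, and -F drops reads that have any of them set. Below is a minimal sketch that splits an integer into the names of its bits. The function name decode_flag is hypothetical and not part of samtools; the bit names are the ones samtools itself uses.

```shell
# Hypothetical helper (not part of samtools): print the names of the
# SAM FLAG bits that are set in an integer, lowest bit first.
decode_flag() {
    flag=$1
    bit=1      # value of the bit being tested: 1, 2, 4, ... 2048
    out=""
    for name in PAIRED PROPER_PAIR UNMAP MUNMAP REVERSE MREVERSE \
                READ1 READ2 SECONDARY QCFAIL DUP SUPPLEMENTARY; do
        # Keep the name when this bit is set in the flag.
        if [ $(( flag & bit )) -ne 0 ]; then
            out="${out:+$out,}$name"
        fi
        bit=$(( bit * 2 ))
    done
    echo "$out"
}

# Decode every integer used in the commands above.
for f in 65 129 77 141 3584 3588 4 1; do
    printf '%s\t%s\n' "$f" "$(decode_flag "$f")"
done
```

Read this way, -F 3588 would drop reads that are unmapped, QC-failed, duplicates, or supplementary, while -F 3584 drops the same set except unmapped reads. -f 65 and -f 129 would keep paired reads that are first or second in the pair, and -f 77 and -f 141 would additionally require that both the read and its mate are unmapped. So it seems to be the UNMAP bit (4) that separates host from non-host. If samtools is installed, its `samtools flags` subcommand performs the same conversion.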