
Ktrim doesn't work for Ribo-seq fastq files

Open Ci-TJ opened this issue 5 years ago • 2 comments

Hello, I just used Ktrim to trim Ribo-seq reads, but it did not work. I then tried trim_galore, which was able to trim the residual adapter from the reads.

`[ymwang @ ~/linqin/RNA-seq/Analysis/GSE123611/out_Ktrim]$ nohup ktrim -U ../fastq/SRR6869758.sra.fastq -s 20  -o SRR58 &
[2] 4997
 [ymwang @ ~/linqin/RNA-seq/Analysis/GSE123611/out_Ktrim]$ nohup: ignoring input and appending output to ‘nohup.out
(cutadaptenv)  [ymwang @ ~/linqin/RNA-seq/Analysis/GSE123611/out_Ktrim]$ nohup trim_galore --length 20 -o galore/ ../fastq/SRR6869758.sra.fastq &
[2] 14323
(cutadaptenv)  [ymwang @ ~/linqin/RNA-seq/Analysis/GSE123611/out_Ktrim]$ nohup: ignoring input and appending output to ‘nohup.out’
`

I have sent the FastQC reports to your email. Best, Ci

Ci-TJ, Jul 16 '20 07:07

Hi Ci,

Thanks for your interest in my work. I believe the issue is the adapter sequence used.

By default, Ktrim uses the Illumina TruSeq kit adapter AGATCGGAAGAGC, whereas your data were generated with the Illumina Small RNA-seq kit, which uses a different adapter sequence, TGGAATTCTCGGGTGCCAAGG (you can find this sequence in Illumina's document at https://support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/experiment-design/illumina-adapter-sequences-1000000002694-14.pdf). Trim_galore works because it is a wrapper: it automatically detects the adapter from the first reads in the data (the first 1 million, according to the Trim Galore documentation) and then passes the adapter sequence to Cutadapt. So, to use Ktrim on your data, add the option `-a TGGAATTCTCGGGTGCCAAGG` to specify the adapter.
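One quick way to check which adapter a library carries before choosing the `-a` value is to count how many of the first reads contain each candidate adapter prefix. The following is only a minimal sketch, not how Trim Galore or Ktrim work internally: the function name `count_adapters` is hypothetical, the two candidate sequences are the ones quoted in this thread, and the input is assumed to be an uncompressed FASTQ file with exactly four lines per record.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count reads containing each candidate adapter prefix.
# Usage: count_adapters reads.fastq [n_reads]   (n_reads defaults to 10000)
count_adapters() {
    local fastq="$1"
    local n_reads="${2:-10000}"
    local pair name seq hits
    # name:sequence pairs; extend this list with other adapters as needed.
    for pair in \
        "TruSeq:AGATCGGAAGAGC" \
        "SmallRNA:TGGAATTCTCGG"; do
        name="${pair%%:*}"
        seq="${pair#*:}"
        # Assumes 4-line FASTQ records; the sequence is line 2 of each record.
        # grep -c exits non-zero on zero matches, hence the "|| true".
        hits=$(head -n "$((n_reads * 4))" "$fastq" \
            | awk 'NR % 4 == 2' \
            | grep -c -- "$seq" || true)
        printf '%s\t%s\t%s\n' "$name" "$seq" "$hits"
    done
}
```

For example, `count_adapters ../fastq/SRR6869758.sra.fastq 10000` prints one tab-separated line per candidate (name, sequence, number of matching reads); the candidate with by far the most hits is the likely adapter.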

Regards,

Kun


hellosunking, Jul 16 '20 08:07

Hi. I tried your advice; it works, but not well. Ribo-seq inserts are mostly 27~32 nt, yet many reads are still 50 nt after trimming with Ktrim. That may be caused by the adapter sequence: I found that trim_galore used the adapter sequence TGGAATTCTCGG. I then tried the TruSeq Ribo Profile adapter sequence "AGATCGGAAGAGCACACGTCT", and the result was bad.

[ymwang @ ~/linqin/RNA-seq/Analysis/GSE123611/out_Ktrim]$ nohup ktrim -U ../fastq/SRR6869758.sra.fastq -o 2SRR58 -a  TGGAATTCTCGGGTGCCAAGG &
[1] 29995
 [ymwang @ ~/linqin/RNA-seq/Analysis/GSE123611/out_Ktrim]$ nohup: ignoring input and appending output to ‘nohup.out’

###try TruSeq Ribo Profile adapter sequence "AGATCGGAAGAGCACACGTCT"
[ymwang @ ~/linqin/RNA-seq/Analysis/GSE123611/out_Ktrim]$ nohup ktrim -U ../fastq/SRR6869758.sra.fastq -o 3SRR58 -a  AGATCGGAAGAGCACACGTCT &
[1] 30366
 [ymwang @ ~/linqin/RNA-seq/Analysis/GSE123611/out_Ktrim]$ nohup: ignoring input and appending output to ‘nohup.out’

nohup fastqc -t 18 -o ./ 2SRR58.read1.fq &
nohup fastqc -t 18 -o ./ 3SRR58.read1.fq &
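
Besides the FastQC report, the read-length distribution of a trimmed file can be tabulated directly on the command line to see how many reads fall in the expected 27~32 nt range. This is a minimal sketch: the function name `read_length_hist` is hypothetical, and it assumes an uncompressed FASTQ file with exactly four lines per record.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "length<TAB>count" per read length, shortest first.
# Usage: read_length_hist trimmed.fq
read_length_hist() {
    # Assumes 4-line FASTQ records; the sequence is line 2 of each record.
    awk 'NR % 4 == 2 { n[length($0)]++ }
         END { for (len in n) printf "%d\t%d\n", len, n[len] }' "$1" \
        | sort -n
}
```

For example, `read_length_hist 2SRR58.read1.fq` lists the lengths found in the first Ktrim output above. A large count at the full read length suggests that the adapter was not found in those reads.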

~/linqin/RNA-seq/Analysis/GSE123611/out_Ktrim/galore]$ head -n 20 SRR6869758.sra.fastq_trimming_report.txt

SUMMARISING RUN PARAMETERS
==========================
Input filename: ../fastq/SRR6869758.sra.fastq
Trimming mode: single-end
Trim Galore version: 0.5.0
Cutadapt version: 2.10
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'TGGAATTCTCGG' (Illumina small RNA adapter; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length before a sequence gets removed: 20 bp


This is cutadapt 2.10 with Python 3.7.6
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a TGGAATTCTCGG ../fastq/SRR6869758.sra.fastq
WARNING: Option --format is deprecated and ignored because the input file format is always auto-detected
Processing reads on 1 core in single-end mode ...
Finished in 724.54 s (28 us/read; 2.14 M reads/minute).

Will Ktrim add a function to auto-detect the adapter sequence in the future?

Best, Ci

Ci-TJ, Jul 19 '20 05:07