Weird behavior

Open dgpinheiro opened this issue 11 years ago • 6 comments

Hi,

We are using scythe to trim a 3' adapter, but we found very weird behavior with this read (in.fq):

@014_1000001169_x1
AAAAAAGATGCCAGTTGAAGAACTGATGGAATTCTCGGGTGCCAAAGAACTAAAG
+014_1000001169_x1
BBBB>>1111B1B1BBBBF1BF1BB1B11BBBBAD3A00A0BBDB00BB0D1AB1

and adapter fasta file (adapt.fa):

>RPI10
TGGAATTCTCGGGTGCCAAGGAACTCCAGTCACTAGCTTATCTCGTATGCCGTCTTCTGCTTG

The result of this command: scythe -o out.fq -m match.txt -a adapt.fa in.fq

is this fastq file (out.fq):

@014_1000001169_x1
N
+
B

Why does scythe trim the entire read? The content of the match file (match.txt) is:

p(c|s): 1.000000; p(!c|s): 0.000000; adapter: RPI10
014_1000001169_x1
TGGAATTCTCGGGTGCCAAGGAACTCCAG
||||||||||||||||||| |||||  ||
TGGAATTCTCGGGTGCCAAAGAACTAAAG
B11BBBBAD3A00A0BBDB00BB0D1AB1
[1.00, 0.97, 0.97, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 0.98, 1.00, 0.97, 0.97, 1.00, 0.97, 1.00, 1.00, 1.00, 1.00, 0.97, 0.97, 1.00, 1.00, 0.97, 1.00, 0.97, 1.00, 1.00, 0.97]

So it should trim only the 3' region like this (according to the match region):

@014_1000001169_x1
AAAAAAGATGCCAGTTGAAGAACTGA
+014_1000001169_x1
BBBB>>1111B1B1BBBBF1BF1BB1

dgpinheiro Oct 28 '14 00:10
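
The expected trim point in the report above can be checked by hand. Below is a minimal Python sketch that uses only the sequences posted in this issue. It does not reproduce scythe's probabilistic scoring; it only locates the read segment reported in match.txt and compares it with the start of the adapter. The helper name `mismatch_positions` is made up for illustration.

```python
# Sequences copied from in.fq, adapt.fa and match.txt as posted in this issue.
read = "AAAAAAGATGCCAGTTGAAGAACTGATGGAATTCTCGGGTGCCAAAGAACTAAAG"
qual = "BBBB>>1111B1B1BBBBF1BF1BB1B11BBBBAD3A00A0BBDB00BB0D1AB1"
adapter = "TGGAATTCTCGGGTGCCAAGGAACTCCAGTCACTAGCTTATCTCGTATGCCGTCTTCTGCTTG"
matched = "TGGAATTCTCGGGTGCCAAAGAACTAAAG"  # read segment reported in match.txt


def mismatch_positions(a, b):
    """Return the 0-based positions where two equal-length strings differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# The reported match sits at the 3' end of the read.
start = read.rfind(matched)
assert start + len(matched) == len(read)

# Everything before the match is what should remain after trimming.
trimmed_seq = read[:start]
trimmed_qual = qual[:start]

# Compare the matched segment with the adapter prefix of the same length.
mismatches = mismatch_positions(adapter[:len(matched)], matched)

print(start, len(matched), mismatches)  # 26 29 [19, 25, 26]
print(trimmed_seq)                      # AAAAAAGATGCCAGTTGAAGAACTGA
print(trimmed_qual)                     # BBBB>>1111B1B1BBBBF1BF1BB1
```

So the match covers the last 29 bases with three mismatches, and a trim at that point would leave a 26-base read.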

Hi Daniel,

Thanks for reporting this — I'll take a look. I think you had emailed me regarding this earlier; my sincerest apologies for my delayed response (I've been slammed with work lately!). I'll try to take a look at this issue this week.

vsbuffalo Oct 28 '14 05:10

Daniel,

You probably want to reduce the -M flag. The correctly trimmed fragment you show above is of length 26, and scythe by default only keeps reads longer than 30 bp. If you want to keep all, use -M 1.

For me, scythe -o out.fq -m match.txt -M 1 -a adapt.fa in.fq gives the following for out.fq:

@014_1000001169_x1 
AAAAAAGATGCCAGTTGAAGAACTGA
+
BBBB>>1111B1B1BBBBF1BF1BB1

Hope that helps,

Cheers, Kevin

kdm9 Oct 28 '14 06:10

My bad, make that 35bp by default :smile:

kdm9 Oct 28 '14 06:10
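
The behavior described in the two comments above can be sketched in a few lines of Python. This is only a model of what was observed in this thread, not scythe's actual code. The function name `apply_min_length` is hypothetical, the default of 35 comes from the correction above, the `N`/`B` placeholder is taken from the out.fq in the report, and whether the real comparison is `<` or `<=` is an assumption.

```python
def apply_min_length(seq, qual, adapter_start, min_keep=35):
    """Trim a read at adapter_start, then apply a minimum-length rule.

    Models the behavior seen in this thread: a trimmed read shorter than
    min_keep is not dropped but replaced by a one-base placeholder.
    The strict '<' comparison is an assumption.
    """
    kept_seq = seq[:adapter_start]
    kept_qual = qual[:adapter_start]
    if len(kept_seq) < min_keep:
        return "N", "B"  # placeholder record, as seen in the reported out.fq
    return kept_seq, kept_qual


read = "AAAAAAGATGCCAGTTGAAGAACTGATGGAATTCTCGGGTGCCAAAGAACTAAAG"
qual = "BBBB>>1111B1B1BBBBF1BF1BB1B11BBBBAD3A00A0BBDB00BB0D1AB1"

# Adapter match starts at offset 26, so 26 bases remain after trimming.
print(apply_min_length(read, qual, 26))
# ('N', 'B')
print(apply_min_length(read, qual, 26, min_keep=1))
# ('AAAAAAGATGCCAGTTGAAGAACTGA', 'BBBB>>1111B1B1BBBBF1BF1BB1')
```

With the default threshold the 26-base remainder falls below the cutoff, which matches the single `N` in the original report; with a threshold of 1 the trimmed read is kept, as in the `-M 1` run.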

Hi Vincent,

We have used scythe in our analysis, but after seeing this behavior I am concerned about continuing to use it. Hopefully you can help us.

Thanks,

Daniel

Daniel Guariz Pinheiro Professor Assistente Doutor (FCAV/Unesp)

dgpinheiro Oct 28 '14 17:10

Daniel,

Please see @kdmurray91's comments; he is correct, and this is not a bug. Your match is 29 bases long, which leaves only 26 bases after trimming, and -M defaults to 35 bp.

vsbuffalo Oct 28 '14 18:10

Oh...

OK, I see @kdmurray91's comments. Now I understand: the -M option causes this sequence to be reduced to a single base. Sorry, I had thought that sequences like this (shorter than the -M threshold) would be removed from the output file.

Thanks,

Daniel

dgpinheiro Oct 28 '14 18:10
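
As the last comment notes, reads that fall below the -M threshold stay in the output as one-base placeholders instead of being removed. If they need to be dropped afterwards, a small post-filter is one option. The sketch below is not a scythe feature; it assumes plain four-line FASTQ records (no wrapped sequence lines), and the function names `fastq_records` and `drop_placeholders` are made up for illustration.

```python
import io
from itertools import islice


def fastq_records(handle):
    """Yield (header, seq, plus, qual) tuples from a four-line FASTQ stream.

    A trailing incomplete record (fewer than four lines) is silently ignored.
    """
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if len(record) < 4:
            return
        yield tuple(record)


def drop_placeholders(records, min_length=2):
    """Keep only records whose sequence has at least min_length bases."""
    for header, seq, plus, qual in records:
        if len(seq) >= min_length:
            yield header, seq, plus, qual


# Small in-memory demo: r1 is a one-base placeholder, r2 is a normal read.
demo = io.StringIO("@r1\nN\n+\nB\n@r2\nACGTAC\n+\nBBBBBB\n")
kept = list(drop_placeholders(fastq_records(demo)))
print(kept)  # [('@r2', 'ACGTAC', '+', 'BBBBBB')]
```

For a real file, the same generators can be fed from `open("out.fq")` and the surviving records written back out four lines at a time.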