sff_extract to support amplicon reads

Open fangly opened this issue 10 years ago • 3 comments

When using sff_extract (seq_crumbs 0.1.8) on SFF files that contain amplicon reads, I get the warning:


WARNING: weird sequences in file /srv/whitlam/bio/data/pyrotags/raw/Gasket67/Gasket67.sff After applying left clips too many reads start with: A This does not look sane. [...]


In my case, since the reads are not shotgun but amplicon, I do expect many reads to start with the same nucleotide. Would it be possible to add a flag called --amplicon to inform sff_extract that the input contains amplicon sequences and to not display this warning?

Thanks,

Florent

fangly, Nov 11 '13 03:11

I now realize that the --max_percentage option does exactly this, though I did not understand its meaning when I first read the help page.

I suggest explaining exactly what this option does in the help page, and mentioning that it is judicious to use --max_percentage 100 when processing SFF files that contain amplicon reads.

Best,

Florent

fangly, Nov 11 '13 04:11

You're right, we should write a manual.

JoseBlanca, Nov 11 '13 08:11

@fangly: Sorry about the poor explanation. I wrote that option myself and submitted it to sff_extract. It was meant to be used with a value lower than the default 50% for shotgun reads. I could never find a good way to explain what it does to someone who has never used sff_extract before... But if you have a suggestion on how to explain the option better, I would really like to hear it, because I just can't seem to come up with one...
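For anyone reading along, here is a rough Python sketch of the kind of check the warning describes: after applying the left clip, count which nucleotide each read starts with and complain if the most common one exceeds the threshold. This is only an illustration and not the actual sff_extract code; the function names, the way reads and clip positions are passed in, and the strict "greater than" comparison are all assumptions made for the example. The 50 default is the value mentioned in this thread.

```python
from collections import Counter


def most_common_start(reads, left_clips):
    """Return (base, percentage) for the most frequent first base
    once each read has been trimmed at its left clip position.

    Hypothetical helper: `reads` is a list of sequence strings and
    `left_clips` a list of 0-based clip offsets, one per read.
    """
    starts = Counter()
    for seq, clip in zip(reads, left_clips):
        trimmed = seq[clip:]
        if trimmed:  # skip reads that are empty after clipping
            starts[trimmed[0].upper()] += 1
    total = sum(starts.values())
    if total == 0:
        return None, 0.0
    base, count = starts.most_common(1)[0]
    return base, 100.0 * count / total


def looks_weird(reads, left_clips, max_percentage=50.0):
    """True if too many clipped reads start with the same base.

    Assumption: the warning fires when the percentage is strictly
    greater than max_percentage, so 100 can never trigger it.
    """
    _, pct = most_common_start(reads, left_clips)
    return pct > max_percentage


# Toy amplicon-like data: a 4-base key, then reads that mostly start with A.
reads = ["tcagACGT", "tcagAGGT", "tcagATTT", "tcagCATG"]
clips = [4, 4, 4, 4]
print(most_common_start(reads, clips))
print(looks_weird(reads, clips))                      # default threshold of 50
print(looks_weird(reads, clips, max_percentage=100))  # amplicon setting
```

Seen this way, the option could perhaps be described as "the maximum percentage of reads allowed to start with the same nucleotide after left clipping before a warning is issued", which would also make it clear why 100 suits amplicon data.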

StuntsPT, Nov 14 '13 16:11