
bedSort fails for 0 length features

Open • bernt-matthias opened this issue 6 years ago • 1 comment

bedSort outputs the following for the SNPs dataset from UCSC:

...
chr22	17586594	17586595	rs34484815	0	+
chr22	17586605	17586605	rs536619616	0	+
chr22	17586604	17586605	rs560126106	0	+
...

I guess the problem is the 0-length features (start equal to end, as in rs536619616), which do not seem to make sense. But bedtools should still output sorted data.
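For reference, here is a minimal sketch of the ordering I would expect for these three records, assuming a plain sort by chromosome, then start, then end. It is only an illustration of the expected result, not how bedSort or bedtools implements sorting; `sort_key` is a made-up helper name.

```python
# The three records from the output above, in the order they were emitted.
records = [
    ("chr22", 17586594, 17586595, "rs34484815", 0, "+"),
    ("chr22", 17586605, 17586605, "rs536619616", 0, "+"),
    ("chr22", 17586604, 17586605, "rs560126106", 0, "+"),
]


def sort_key(rec):
    """Hypothetical key: order by chromosome name, then start, then end."""
    chrom, start, end = rec[0], rec[1], rec[2]
    return (chrom, start, end)


# A zero-length record (start == end) needs no special handling here:
# it is ordered by its start coordinate like any other record.
expected = sorted(records, key=sort_key)

for rec in expected:
    print("\t".join(str(field) for field in rec))
```

With this key, rs560126106 (start 17586604) comes before the zero-length rs536619616 (start 17586605), which is the reverse of the output shown above.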

bernt-matthias avatar Sep 27 '17 09:09 bernt-matthias

The note from UCSC on the validity of 0 length SNPs:

We consider point insertions into the genome to be zero length features. You can see the SNP in question in the following Genome Browser view: http://genome.ucsc.edu/cgi-bin/hgTracks?hgS_doOtherUser=submit&hgS_otherUserName=chmalee&hgS_otherUserSessionName=hg19_chr22PointInsertion

where the highlighted SNP indicates a G or GG insertion between bases 17586605 and 17586606 on chromosome 22. Because we internally store our coordinates as zero-based half open coordinates, these point insertions end up as zero length coordinates. For more information on our coordinate system please see the following blog post: http://genome.ucsc.edu/blog/the-ucsc-genome-browser-coordinate-counting-systems/
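To make the coordinate arithmetic in that note concrete, here is a small sketch, assuming the zero-based half-open convention it describes, where a feature covers positions start up to but not including end. The function names are made up for illustration and are not part of any UCSC or bedtools API.

```python
def feature_length(start, end):
    """Length of a zero-based, half-open interval [start, end)."""
    return end - start


def flanking_bases(position):
    """For a zero-length feature with start == end == position, return the
    1-based coordinates of the two bases the insertion point lies between."""
    return (position, position + 1)


# rs536619616: start == end, so the feature has length 0 (a point insertion).
print(feature_length(17586605, 17586605))

# rs560126106: an ordinary single-base feature of length 1.
print(feature_length(17586604, 17586605))

# The insertion point sits between these two 1-based positions.
print(flanking_bases(17586605))
```

Under this convention, start equal to end is a valid way to describe an insertion between bases 17586605 and 17586606, which matches the UCSC description.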

bernt-matthias avatar Sep 27 '17 21:09 bernt-matthias