abra
Bug in BED file parsing
Hi Lisle,
I found a small bug in your BED file parsing: header lines that contain no tab are treated as data lines. This leads to a crash, because the second field is accessed after splitting but does not exist. I guess you should only consider lines that have three or more fields after splitting.
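To illustrate the suggestion, here is a minimal sketch of such a filter. It is not the actual `abra.RegionLoader` code; the class `BedLineFilter` and its methods are hypothetical names made up for this example, and it assumes data lines are tab-separated with chromosome, start and end in the first three fields.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only, not part of abra.
class BedLineFilter {

    // Returns true only for lines that look like BED data lines:
    // not empty, not a comment, not a "browser"/"track" header,
    // and with at least three tab-separated fields (chrom, start, end).
    static boolean isDataLine(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty() || trimmed.startsWith("#")) {
            return false;
        }
        String firstToken = trimmed.split("\\s+")[0];
        if (firstToken.equals("browser") || firstToken.equals("track")) {
            return false;
        }
        return line.split("\t").length >= 3;
    }

    // Collects {chrom, start, end} for every data line and skips everything else.
    static List<String[]> parse(List<String> lines) {
        List<String[]> regions = new ArrayList<>();
        for (String line : lines) {
            if (isDataLine(line)) {
                String[] fields = line.split("\t");
                regions.add(new String[] { fields[0], fields[1], fields[2] });
            }
        }
        return regions;
    }
}
```

With a check like this, the `browser` and `track` lines of the file below would be skipped instead of being split and indexed.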
Here is the command and its output:
java -cp abra.jar abra.KmerSizeEvaluator 100 hg19.fa /tmp/test 1 test.bed
Loading reference map: /tmp/local_ngs_data/hg19.fa
Chromosome: chrM length: 16571
Chromosome: chr1 length: 249250621
...
Done loading ref map. Elapsed secs: 179
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
    at abra.RegionLoader.load(RegionLoader.java:45)
    at abra.ReAligner.getRegions(ReAligner.java:702)
    at abra.KmerSizeEvaluator.run(KmerSizeEvaluator.java:50)
    at abra.KmerSizeEvaluator.main(KmerSizeEvaluator.java:240)
And this is the BED file:
cat test.bed
browser position chr7:127471196-127495720
browser hide all
track name="ItemRGBDemo" description="Item RGB demonstration" visibility=2
chr7 127471196 127472363 Pos1 0 + 127471196 127472363 255,0,0
chr7 127472363 127473530 Pos2 0 + 127472363 127473530 255,0,0
chr7 127473530 127474697 Pos3 0 + 127473530 127474697 255,0,0
chr7 127474697 127475864 Pos4 0 + 127474697 127475864 255,0,0
chr7 127475864 127477031 Neg1 0 - 127475864 127477031 0,0,255
chr7 127477031 127478198 Neg2 0 - 127477031 127478198 0,0,255
chr7 127478198 127479365 Neg3 0 - 127478198 127479365 0,0,255
chr7 127479365 127480532 Pos5 0 + 127479365 127480532 255,0,0
chr7 127480532 127481699 Neg4 0 - 127480532 127481699 0,0,255
Best regards,
Marc