ntCard
integer underflow bug?
A user of RNA-Bloom discovered a bug in ntCard version 1.2.2: https://github.com/bcgsc/RNA-Bloom/issues/43
Here are the first 10 lines from the output histogram file.
>head rnabloom_out/rnabloom_k17.hist
F1 270
F0 0
1 9223372036854775808
2 9223372036854775808
3 9223372036854775808
4 9223372036854775808
5 9223372036854775808
6 9223372036854775808
7 9223372036854775808
8 9223372036854775808
The output is bogus.
Shouldn't F0 be >= 1 if F1 is > 1?
The values for the histogram are all 9223372036854775808. I wonder if this is an integer underflow?
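For anyone triaging similar output, here is a minimal Python sketch of the two symptoms described above. The function name `check_hist` and the constant `SUSPECT` are made up for illustration and are not part of ntCard. One observation that may help: the repeated value is exactly 2**63, whereas a plain wrap-around of `0 - 1` in an unsigned 64-bit integer would give 2**64 - 1. That alone does not pin down the cause.

```python
SUSPECT = 2**63  # 9223372036854775808, the value repeated in the histogram

def check_hist(text):
    """Return a list of problems found in ntCard-style .hist text.

    Assumes the layout shown in this issue: 'F1 <n>' and 'F0 <n>' lines
    followed by '<multiplicity> <count>' lines, whitespace separated.
    """
    totals = {}
    counts = {}
    for line in text.strip().splitlines():
        key, value = line.split()
        if key.startswith("F"):
            totals[key] = int(value)
        else:
            counts[int(key)] = int(value)

    problems = []
    # F1 is the total k-mer count and F0 the distinct k-mer count,
    # so a non-empty input cannot have zero distinct k-mers.
    if totals.get("F1", 0) > 0 and totals.get("F0", 0) < 1:
        problems.append("F1 > 0 but F0 < 1")
    # A bin count of 2**63 or more is implausible for any real dataset.
    bad = [m for m, c in counts.items() if c >= SUSPECT]
    if bad:
        problems.append(f"{len(bad)} bins hold values >= 2**63")
    return problems
```

Running it on the eight histogram lines quoted above reports both problems; a healthy histogram yields an empty list.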
Here is the exact command that was used:
ntcard -t 8 -k 17 -c 65535 -p ntcard_test filtered.fastq
I was able to replicate the exact same output as well.
I think it is an underflow, but I thought I had fixed this. Just confirming: your filtered.fastq is extremely small, right?
You are right, it should be an underflow instead of an overflow. The FASTQ has only 3 reads.
So, as I understand it, the problem is that there aren't enough reads to sample, and every value in that TSV should be 0 aside from F1. That is definitely a bug, but what is the user's intention in using a file with only three reads?
It's easier/better to run an assembler on a file and see if it fails or succeeds to assemble anything rather than predict if/when it will fail or succeed.
@jowong4 I remember you fixed this issue, right? We need to add an error or exception to ntCard for inputs with too few reads/k-mers.
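Until such a check exists in ntCard itself, a caller could guard against tiny inputs before invoking it. The Python sketch below is only an illustration: `count_kmers`, `require_enough_kmers`, and the `min_kmers` threshold are hypothetical, and the right cutoff is not known from this thread. It assumes unwrapped four-line FASTQ records and counts every k-mer position, so it is an upper bound if some k-mers are skipped downstream.

```python
def count_kmers(fastq_lines, k):
    """Count k-mer positions in FASTQ text (4 lines per record, unwrapped)."""
    total = 0
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # second line of each record is the sequence
            total += max(0, len(line.strip()) - k + 1)
    return total

def require_enough_kmers(fastq_lines, k, min_kmers=1000):
    """Raise ValueError if the input has fewer than min_kmers k-mers.

    min_kmers is a placeholder threshold chosen for this example only.
    """
    n = count_kmers(fastq_lines, k)
    if n < min_kmers:
        raise ValueError(f"only {n} k-mers at k={k}; need at least {min_kmers}")
    return n
```

For example, a single 20 bp read at k = 17 has 4 k-mer positions, so `require_enough_kmers` would reject it with the default threshold.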
We had quite a picnic with this bug. In our case it happened because ntCard uses the name of the read to determine the type of the file; not sure why this is even needed.
The offending function is: getftype @ ntcard.cpp
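As a point of comparison, here is a rough Python sketch of format detection that looks at record structure instead of the contents of the read name. It is not how `getftype` works and is not a proposed patch; `sniff_format` is a hypothetical helper, and it assumes unwrapped FASTQ and SAM files that start with a header.

```python
def sniff_format(lines):
    """Guess 'fasta', 'fastq', 'sam', or 'unknown' from the first lines.

    Uses the leading character of line 1 and, for '@', whether line 3 is the
    FASTQ '+' separator. The text of the read name itself is never inspected.
    """
    first = lines[0] if lines else ""
    if first.startswith(">"):
        return "fasta"
    if first.startswith("@"):
        # A FASTQ record is: @name, sequence, +, qualities.
        if len(lines) >= 3 and lines[2].startswith("+"):
            return "fastq"
        return "sam"  # assumed: '@' lines without a '+' separator are SAM headers
    return "unknown"
```

With this approach a FASTQ file is classified the same way whatever its reads are called, at the cost of reading three lines instead of one.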