minimap2

Distinct minimizer count integer overflow with large reference FASTA

Open cjprybol opened this issue 2 months ago • 0 comments

Hi @lh3,

I'm using slices of NCBI's NT database for PacBio HiFi read mapping. I noticed that the distinct minimizer count reported on stderr seems to have overflowed its integer type. Would you expect this to impact the trustworthiness of the mapping results, or is this just a cosmetic issue in the stderr reporting?

Prior mapping run with nt_others: the number of minimizers is positive and the singleton percentage is < 100:

[M::mm_idx_stat::128.816*2.87] distinct minimizers: 232724554 (61.91% are singletons); average occurrences: 2.649; average spacing: 10.138; total length: 6250598365

Mapping run with nt_prok: the number of minimizers, the singleton percentage, and the average occurrences are all negative:

[M::mm_idx_stat::4819.440*3.47] distinct minimizers: -814309831 (-245.31% are singletons); average occurrences: -30.207; average spacing: 10.032; total length: 246767835107
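
As a rough sanity check on the nt_prok line, the sketch below assumes the distinct-minimizer counter is a signed 32-bit integer that wrapped around exactly once. That is an assumption about the reporting code, not something confirmed here. Under that assumption, adding 2^32 to the reported count recovers a positive value, and the singleton percentage and average occurrences can be recomputed from the printed ratios. The helper name `unwrap_int32` is hypothetical, and the recomputed ratios are approximate because the log values are rounded. This only addresses the printed statistics and says nothing about whether the mapping itself is affected.

```python
# Sketch: undo a suspected signed 32-bit wraparound in the logged statistics.
# ASSUMPTION: the counter is a signed 32-bit int that overflowed exactly once.

INT32_SPAN = 1 << 32  # number of distinct values a 32-bit integer can hold


def unwrap_int32(reported: int, wraps: int = 1) -> int:
    """Add back `wraps` multiples of 2**32 to a wrapped 32-bit counter."""
    return reported + wraps * INT32_SPAN


# Values copied from the nt_prok log line.
reported_distinct = -814309831
reported_singleton_pct = -245.31
reported_avg_occ = -30.207

# The printed ratios were computed against the wrapped count, so multiplying
# them back recovers the (approximate) numerators.
singletons = reported_singleton_pct / 100 * reported_distinct
total_occurrences = reported_avg_occ * reported_distinct

distinct = unwrap_int32(reported_distinct)
singleton_pct = 100 * singletons / distinct
avg_occ = total_occurrences / distinct

print(f"distinct minimizers: {distinct}")
print(f"singletons: {singleton_pct:.2f}%")
print(f"average occurrences: {avg_occ:.3f}")
```

If the assumption holds, the unwrapped count is 3480657465, with roughly 57.39% singletons and about 7.067 average occurrences, which is in a plausible range next to the nt_others run.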

I believe that I am using the latest release: [M::main] Version: 2.28-r1209

Thank you, Cameron

cjprybol, Apr 14 '24 15:04