sdsl-lite
Maximum limit on reference and query string length
Many thanks for providing this library!
I'm using the following code snippet as part of my application:
csa_wt<> fm_index;
construct_im(fm_index, "mississippi!", 1);
std::cout << "'si' occurs " << count(fm_index,"si") << " times.\n";
But instead of "mississippi", I have a string of about 60 billion characters. This string is constructed by concatenating long reads (20x coverage of the human genome), using the "$" symbol as a separator. My query sequences are also long reads, whose length can exceed 1M nucleotide characters. My overall codebase is misbehaving: it runs to completion but produces incorrect results. While I'm starting to debug this now, I'm wondering whether I am exceeding the string length limits of sdsl-lite. The same codebase works fine for smaller datasets derived from bacterial genomes.
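For reference while debugging, here is a minimal sketch of two sanity checks that do not depend on sdsl-lite at all. The helper names (`bits_needed`, `fits_in_32_bits`, `naive_count`) are hypothetical and are not part of the library. The first two show that a text of about 60 billion characters cannot be addressed with 32-bit positions, so any length, offset, or counter in the surrounding application that is stored in an `int` or `uint32_t` would overflow. The third is a brute-force occurrence counter that could be compared against `count(fm_index, ...)` on small inputs. None of this says anything about the library's actual limits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Number of bits needed to represent every value in [0, n].
// Hypothetical helper, not an sdsl-lite function.
unsigned bits_needed(std::uint64_t n) {
    unsigned bits = 1;
    while (n >>= 1) {
        ++bits;
    }
    return bits;
}

// True if every position in a text of n characters fits in 32 bits.
bool fits_in_32_bits(std::uint64_t n) {
    return n <= UINT32_MAX;
}

// Brute-force count of (possibly overlapping) occurrences of pattern in text.
// Only practical for small inputs; intended as a cross-check for an index.
std::uint64_t naive_count(const std::string& text, const std::string& pattern) {
    assert(!pattern.empty());
    std::uint64_t hits = 0;
    for (std::size_t pos = text.find(pattern);
         pos != std::string::npos;
         pos = text.find(pattern, pos + 1)) {
        ++hits;
    }
    return hits;
}
```

For a length of 60,000,000,000, `bits_needed` returns 36 and `fits_in_32_bits` returns false; for the example above, `naive_count("mississippi!", "si")` returns 2.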
I'm using the latest code from the master branch (commit: c32874c).
Thanks!