sd_vector_builder: requested capacity is larger than vector size.
Hello, unfortunately, after installing Moni with the .sh script, I am now facing another issue. When I try to execute the example code:
moni build -r data/SARS-CoV2/SARS-CoV2.1k.fa.gz -o sars-cov2 -f
I get the following error:
==== Command line: ./moni-0.2.0-Linux/bin/newscanNT.x ./SARS-CoV2.1k.fa.gz -w 10 -p 100 -f -s
Windows size: 10
Stop word modulus: 100
Total input symbols: 0
Found 1 distinct words
Parsing took: 0 wall clock seconds
Sum of lenghts of dictionary words: 11
Total number of words: 1
Writing plain dictionary and occ file
Dictionary construction took: 0 wall clock seconds
Generating remapped parse file
Remapping parse file took: 0 wall clock seconds
==== Elapsed time: 0 wall clock seconds
malloc_count ### exiting, total: 90960, peak: 54361, current: 4096
[INFO] 14:49:21 - Message: Building the sequence index
terminate called after throwing an instance of 'std::runtime_error'
what(): sd_vector_builder: requested capacity is larger than vector size.
I have tried to look for it online, but unfortunately I cannot find anything.
Hi, Andrea,
Something doesn't seem right. It seems the input is empty somehow: Total input symbols: 0
Can you see if the gzipped file actually contains something or not?
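One quick way to check is a small Python script like the one below. This is only a minimal sketch, not part of Moni: the helper name count_fasta_symbols is made up for illustration, and it simply counts header lines (those starting with ">") and sequence characters in a gzipped file. A result of 0 sequence characters, or 0 headers when the -f flag is used, would point to a problem in how the file was created.

```python
import gzip
import sys


def count_fasta_symbols(path):
    """Return (header_lines, sequence_characters) for a gzipped FASTA file.

    Hypothetical helper for illustration; it does not reproduce how Moni
    itself parses its input.
    """
    headers = 0
    symbols = 0
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                headers += 1  # FASTA header line
            else:
                symbols += len(line)  # sequence characters on this line
    return headers, symbols


if __name__ == "__main__":
    # Usage: python check_fasta.py my_input.fa.gz
    if len(sys.argv) > 1:
        print(count_fasta_symbols(sys.argv[1]))
```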
Ok, thanks. With the SARS file from GitHub it works, so the problem is probably in how I am creating my files. As far as I understand, I should always build a text file with a single contiguous string on a single line, then compress it using gzip. Is that right?
Specifically, in my case, I am considering binary strings made up of many 0s and sparse groups of 1s. The reference string is typically around 1 million characters long, while the other one is typically around 50,000.
I see. The issue should not be the compression or the single line (as long as you use the -f flag, which means the input is in FASTA format). If you use binary strings, you may need to play with the -p and -w parameters so that trigger strings can be found. The default values are -w 10 and -p 100. In your case I would probably try -w 5 and -p 30, or something along those lines.
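To get a feel for how the two parameters interact on a given input, a rough experiment like the following may help. It is only a sketch under stated assumptions: it slides a window of w characters over the text, hashes each window, and treats the window as a trigger string when the hash is 0 modulo p (the log above calls these the window size and the stop word modulus). The hash here is a plain polynomial hash chosen for illustration, and the function name, base and modulus are made up, so the counts will not match what newscanNT.x actually finds.

```python
import random


def trigger_positions(text, w, p, base=256, modulus=1999999973):
    """Return start positions of length-w windows whose hash is 0 modulo p.

    Illustrative only: base and modulus are arbitrary choices, not the
    values or the hash function used by Moni.
    """
    positions = []
    for i in range(len(text) - w + 1):
        h = 0
        for ch in text[i:i + w]:
            h = (h * base + ord(ch)) % modulus  # simple polynomial hash
        if h % p == 0:
            positions.append(i)
    return positions


if __name__ == "__main__":
    # Synthetic sparse binary string: mostly 0s with occasional 1s.
    rng = random.Random(0)
    text = "".join("1" if rng.random() < 0.01 else "0" for _ in range(100_000))
    for w, p in [(10, 100), (5, 30)]:
        count = len(trigger_positions(text, w, p))
        print(f"w={w} p={p}: {count} trigger windows")
```

Note that identical windows always hash to the same value, so in a long run of 0s either every window is a trigger or none is. This is one reason why sparse binary input can behave very differently from DNA with the default settings.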
Alright, thank you. How should I interpret those parameters, intuitively? I have read the original article, but I am not quite sure.