NanoSim
Infinite loop in function extract_reads in metagenome mode when length equals max length
In the function extract_reads, when dna_type == "metagenome", the while loop checks that:
length > max(seq_len[s].values())
and:
length < seq_len[s][key]
but there is no case for:
length == max(seq_len[s].values())
so the program gets stuck in an infinite loop
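To illustrate the gap, here is a minimal sketch of the two checks described above. It is not the actual NanoSim code: the function name pick_key_with_retries, the max_attempts cap, and the toy seq_len data are all hypothetical, and the cap is only there so the example terminates instead of hanging.

```python
import random


def pick_key_with_retries(seq_len, s, length, max_attempts=1000, rng=random):
    """Mimic the two checks from the report, giving up after max_attempts.

    Hypothetical helper for illustration only; seq_len maps
    species -> {sequence key -> sequence length}.
    """
    for _ in range(max_attempts):
        # Check 1: re-draw the species only if the read is strictly
        # longer than every sequence of the current species.
        if length > max(seq_len[s].values()):
            s = rng.choice(list(seq_len.keys()))
            continue
        key = rng.choice(list(seq_len[s].keys()))
        # Check 2: accept the key only if the read is strictly shorter.
        if length < seq_len[s][key]:
            return s, key
        # length == max(seq_len[s].values()) passes neither check,
        # so an uncapped loop would spin here forever.
    return None


# Toy data (assumed, not from NanoSim).
seq_len = {"species_a": {"chr1": 100, "chr2": 80}}
print(pick_key_with_retries(seq_len, "species_a", 100))  # None: no key is ever accepted
print(pick_key_with_retries(seq_len, "species_a", 99))
```

With length equal to the longest sequence, the first check never triggers a new species and the second check never accepts a key, which matches the hang described above.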
Thanks for reporting this @danpal96
In the extract_reads function of the simulator.py script, the if clauses are inside a while True loop. If the condition length < seq_len[s][key] is met, the loop proceeds; otherwise, if the length is greater than or equal to the sequence length, it draws another key and checks the if clause again, repeating until it finds a suitable key.
I think you are correct: when the length is exactly equal to the max length, neither condition can be satisfied, so the code keeps generating random keys without ever finding a valid one, which causes the infinite loop.
I am labelling this as a bug for now and my colleagues and I will take a look at it. @cheny19 @kmnip do you have any thoughts on this?
Personally, I am not fond of the strategy of selecting random values within a while-true loop as it can definitely be a potential cause of an infinite loop.
For this part of the code, I think a better alternative is to extract the list of chromosomes that are longer than the read length and select a random chromosome from this list.
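A rough sketch of that alternative, under some assumptions: pick_key is a hypothetical helper (not an existing NanoSim function), it works on the length dictionary of a single species, and it uses >= so that a read exactly as long as a chromosome is accepted. If the downstream code needs strictly longer chromosomes, the comparison would be > instead.

```python
import random


def pick_key(species_seq_len, length, rng=random):
    """Pick a random sequence key that can hold a read of the given length.

    Hypothetical helper for illustration; species_seq_len maps
    sequence key -> sequence length for one species.
    Returns None when no sequence is long enough, so the caller can
    move on to another species instead of looping.
    """
    # Build the candidate list once instead of retrying random draws.
    candidates = [key for key, n in species_seq_len.items() if n >= length]
    if not candidates:
        return None
    return rng.choice(candidates)


# Toy data (assumed, not from NanoSim).
species_seq_len = {"chr1": 100, "chr2": 80}
print(pick_key(species_seq_len, 100))  # chr1, the only candidate
print(pick_key(species_seq_len, 101))  # None, nothing is long enough
```

Since the candidates are filtered up front, the function always returns after one pass, and the "no valid chromosome" case becomes an explicit return value rather than an endless retry.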