HiCAssembler
Pipeline crashes if no good bin on contig
Hi,
The pipeline crashes in the last step, while writing the final FASTA file, with an error like this:
DEBUG:HiCAssembler:reordering matrix
DEBUG:HiCAssembler:{0: [], 1: [], 2: [[('path5', 6186, 23985118, '-'), ('path23', 8764, 914248, '+'), ('path14', 1999, 2167025, '+'), ('path17', 7468, 1634331, '-'), ('path218', 2924, 63973, '+'), ('path142', 0, 97717, '+'), ('path50', 0, 869996, '-'), ('path164', 7819, 86474, '+'), ('path45', 22509, 615142, '-'), ('path96', 5452, 73270, '+'), ('path51', 105763, 625222, '+'), ('path31', 4143, 570787, '+'), ('path28', 35021, 410009, '-'), ('path25', 438085, 1630306, '-')]], 3: [], 4: [], 5: []}
[13211871, 23690440, 31875667]
Traceback (most recent call last):
  File "/usr/local/bin/assemble", line 312, in <module>
    main(args)
  File "/usr/local/bin/assemble", line 308, in main
    chain_file=args.outFolder + "/liftover.chain")
  File "/usr/local/bin/assemble", line 198, in save_fasta
    sequence += record_dict[contig_id][start:end].reverse_complement()
KeyError: 'path100'
Looking at path100, it seems the contig is first split because of a potential misassembly, but then neither fragment has enough good bins, so both are skipped:
INFO:HiCAssembler:Removing 1 misassemblies for path100
path100/1 has few bins (1). Skipping it
path100/2 has few bins (3). Skipping it
I assume that in this case no entry for path100 exists afterwards, so the pipeline crashes when it later looks up the identifier again.
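To illustrate the kind of guard that might avoid the crash, here is a minimal sketch. It is not the actual HiCAssembler code: `build_scaffold` and the plain-string `record_dict` are hypothetical stand-ins for the loop inside `save_fasta`. It also assumes that `'-'` in the path tuples means the piece is reverse-complemented.

```python
import logging

log = logging.getLogger("HiCAssembler")

# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a plain DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_scaffold(path, record_dict):
    """Concatenate the pieces of one scaffold, skipping unknown contig ids.

    path: list of (contig_id, start, end, strand) tuples, shaped like the
          tuples in the debug output above.
    record_dict: hypothetical mapping of contig_id -> sequence string
          (an assumption; the real pipeline may hold sequence records).
    Returns (sequence, skipped_ids).
    """
    sequence = ""
    skipped = []
    for contig_id, start, end, strand in path:
        if contig_id not in record_dict:
            # Instead of raising KeyError, record the missing id and move on.
            log.warning("contig %s not found, skipping it", contig_id)
            skipped.append(contig_id)
            continue
        piece = record_dict[contig_id][start:end]
        # Assumption: '-' strand pieces are written reverse-complemented.
        sequence += reverse_complement(piece) if strand == "-" else piece
    return sequence, skipped


if __name__ == "__main__":
    records = {"path5": "AACCGGTT", "path23": "ACGTAC"}
    path = [("path5", 0, 4, "-"), ("path100", 0, 10, "+"), ("path23", 1, 3, "+")]
    seq, missing = build_scaffold(path, records)
    print(seq, missing)
```

Silently skipping a contig changes the output assembly, so returning the list of skipped ids makes it possible to report them or write them out separately. Whether that is acceptable here is a question for the maintainers.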
Do you have a quick fix for this?
Thank you, Dominik