Pipeline crashes if no good bin on contig

Open dominik-handler opened this issue 4 years ago • 0 comments

Hi,

The pipeline crashes for me in the last step, while writing the final FASTA file, with an error like this:

DEBUG:HiCAssembler:reordering matrix
DEBUG:HiCAssembler:{0: [], 1: [], 2: [[('path5', 6186, 23985118, '-'), ('path23', 8764, 914248, '+'), ('path14', 1999, 2167025, '+'), ('path17', 7468, 1634331, '-'), ('path218', 2924, 63973, '+'), ('path142', 0, 97717, '+'), ('path50', 0, 869996, '-'), ('path164', 7819, 86474, '+'), ('path45', 22509, 615142, '-'), ('path96', 5452, 73270, '+'), ('path51', 105763, 625222, '+'), ('path31', 4143, 570787, '+'), ('path28', 35021, 410009, '-'), ('path25', 438085, 1630306, '-')]], 3: [], 4: [], 5: []}
[13211871, 23690440, 31875667]
Traceback (most recent call last):
  File "/usr/local/bin/assemble", line 312, in <module>
    main(args)
  File "/usr/local/bin/assemble", line 308, in main
    chain_file=args.outFolder + "/liftover.chain")
  File "/usr/local/bin/assemble", line 198, in save_fasta
    sequence += record_dict[contig_id][start:end].reverse_complement()
KeyError: 'path100'

Looking at path100, it seems that it is first split because of a potential misassembly, but then neither fragment has enough good bins, so both are skipped:

INFO:HiCAssembler:Removing 1 misassemblies for path100

path100/1 has few bins (1). Skipping it
path100/2 has few bins (3). Skipping it

I assume that in this case no entry for this path exists later on, so the pipeline crashes when it looks up the identifier again.
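To illustrate the kind of guard that might avoid the crash, here is a minimal sketch. It is not the HiCAssembler code: `build_scaffold`, `reverse_complement`, and the use of plain strings for sequences are all hypothetical, chosen only to keep the example self-contained. It assumes each path entry is a `(contig_id, start, end, strand)` tuple, as in the debug output above, and that the right behavior for a missing contig is to skip it with a warning, which may not be what the maintainers want.

```python
import logging

log = logging.getLogger("save_fasta_sketch")  # hypothetical logger name

# Translation table for complementing plain DNA strings.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a plain DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_scaffold(path, record_dict):
    """Concatenate the contig slices of one scaffold path.

    path        -- list of (contig_id, start, end, strand) tuples
    record_dict -- maps contig IDs to sequences (plain strings here,
                   an assumption made for this sketch)

    Contigs that are absent from record_dict are skipped instead of
    raising KeyError. Returns (sequence, missing_ids).
    """
    sequence = ""
    missing = []
    for contig_id, start, end, strand in path:
        if contig_id not in record_dict:
            log.warning("contig %s not found, skipping it", contig_id)
            missing.append(contig_id)
            continue
        piece = record_dict[contig_id][start:end]
        sequence += reverse_complement(piece) if strand == "-" else piece
    return sequence, missing


if __name__ == "__main__":
    records = {"path5": "AACCGGTT", "path23": "ACGT"}
    path = [("path5", 0, 4, "-"), ("path100", 0, 10, "+"), ("path23", 1, 3, "+")]
    print(build_scaffold(path, records))
```

Returning the list of missing IDs alongside the sequence makes it easy to report which contigs were dropped, rather than silently producing a shorter scaffold.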

Do you have a quick fix for this?

Thank you, Dominik

dominik-handler, May 05 '20 11:05