Unicycler
Hybrid assembly stuck at removing graph overlaps
Hello,
I am running a hybrid assembly, giving Unicycler (v0.4.9b) raw Nanopore reads, the assembly produced by Canu (v1.9), and mapped Illumina PE reads. The genome size is around 1.2 Mb. I launched it on a cluster with 24 CPUs and 200 GB of memory allocated.
Unicycler never gets past graph cleaning (all previous steps complete in less than a day). The cleaning step seems to go fast at first:
Cleaning graph (2020-04-24 04:45:49)
Unicycler now performs various cleaning procedures on the graph to remove
overlaps and simplify the graph structure. The end result is a graph ready for
bridging.
Graph overlaps removed
Removed zero-length segments:
5200, 5203, 5209, 5232, 5260, 5271, 5272, 5278, 5285, 5316, 5317, 5345,
5353, 5354, 5355, 5366, 5374, 5377, 5383, 5402, 5447, 5453, 5455, 5530, 5537,
5539, 5541, 5554, 5557, 5566, 5569, 5571, 5578, 5592, 5621, 5624, 5626, 5653,
5654, 5658, 5675, 5696, 5708, 5713, 5756, 5759, 5778, 5779, 5865, 5944, 5960,
5973, 5982, 5990, 5993, 5994, 6013, 6018, 6019, 6025, 6027, 6029, 6030, 6033,
6034, 6047, 6052, 6061, 6062, 6063, 6064, 6101, 6106, 6108, 6122, 6155, 6158,
6198, 6224, 6243, 6253, 6269, 6279, 6296, 6299, 6309, 6314, 6321, 6341, 6349,
6368, 6398, 6401, 6422, 6423, 6426, 6430, 6434, 6452, 6457, 6458, 6464, 6470,
6484, 6489, 6490, 6493, 6497, 6503, 6505, 6506, 6515, 6528, 6530, 6544, 6552,
6553, 6569, 6572, 6586, 6611, 6617, 6620, 6659, 6664, 6665, 6667, 6683, 6684,
6686, 6693, 6704, 6715, 6720, 6727, 6761, 6805, 6826, 6835, 6836, 6841, 6864,
...........
But at some stage, it seems to proceed much more slowly:
Removed zero-length segments:
5739, 6065, 7915, 15330, 17088, 19126, 19367, 19676, 20309, 21367, 21843,
22049, 22367, 22474, 22692, 22744, 23024, 23164, 23369, 23526, 23602, 23624,
23980, 24182, 24485, 24554, 24706, 25094, 25287, 25311, 25364, 25482, 26189,
26304, 26627, 26642, 26647, 26693, 26704, 26705
Removed zero-length segments:
20863, 22537, 22548, 22685, 22708, 23700, 24285, 24707, 25695, 26711,
26721, 26722, 26723
Removed zero-length segments:
14230, 22710, 22736, 22758, 22971, 23793, 24284, 26725, 26727, 26728
Removed zero-length segments:
21253, 24585, 24739, 25734, 25814, 26729, 26730, 26732, 26733
Removed zero-length segments:
5255, 23122, 24738, 26734, 26735, 26736
Removed zero-length segments:
26740, 26741
And this lasts until the end of the allocated job (21 days!):
Removed zero-length segments:
8051804, 8051805
Removed zero-length segments:
8051806, 8051807
Removed zero-length segments:
8051808, 8051809
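In case it helps with diagnosis, here is a minimal sketch of how the log above could be summarized per cleaning round. It is not part of Unicycler: the function name `summarize_rounds` is my own, and it only assumes the log layout shown above (a `Removed zero-length segments:` header followed by comma-separated segment numbers).

```python
import re

# Strips terminal colour codes, in case the log was captured with them.
ANSI = re.compile(r"\x1b\[[0-9;]*m")
HEADER = "Removed zero-length segments:"
ID_LINE = re.compile(r"[\d,\s]+")


def summarize_rounds(log_text):
    """Return one (count, min_id, max_id) tuple per removal round in the log."""
    rounds = []
    current = None  # segment numbers of the round being read, or None
    for raw in log_text.splitlines():
        line = ANSI.sub("", raw).strip()
        if line == HEADER:
            # A new header closes the previous round, if any.
            if current is not None:
                rounds.append(current)
            current = []
        elif current is not None:
            if ID_LINE.fullmatch(line):
                current.extend(int(x) for x in re.findall(r"\d+", line))
            else:
                # Any other line (blank, prose, dots) ends the round.
                rounds.append(current)
                current = None
    if current is not None:
        rounds.append(current)
    return [(len(ids), min(ids), max(ids)) for ids in rounds if ids]


# Hypothetical usage; the log path is an assumption:
#   with open("unicycler.log") as fh:
#       for count, lo, hi in summarize_rounds(fh.read()):
#           print(count, lo, hi)
```

Run on the full log, this would show how many segments each round removes and how the segment numbers change from round to round, which is the pattern visible in the excerpts above (a handful of removals per round, with numbers climbing from the thousands into the millions).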
Any idea what is going on here?
Thanks in advance for any help! Simon