pipeline crashed: scaffold (Scaffold graph is inconsistent)
Hi, I only managed to run gapless on my ONT assembly (~400 contigs, haploid genome size of 3.1 Gbp, no gaps in the contigs) up to the gapless.py scaffold stage of the pipeline.
My command was:
gapless.sh -i asm.fa -o gapless_out -t nanopore -j 18 ONT_treads.fastq.gz
Here are the contents of gapless_scaffold.log, where multiple errors are being logged:
/home/duda5/anaconda3/lib/python3.9/site-packages/seaborn/distributions.py:2619: FutureWarning: `distplot` is a deprecated function and will be removed in a future version. Please adapt your code to use either `displot` (a figure-level function with similar flexibility) or `histplot` (an axes-level function for histograms).
warnings.warn(msg, FutureWarning)
/home/duda5/anaconda3/lib/python3.9/site-packages/seaborn/distributions.py:2619: FutureWarning: `distplot` is a deprecated function and will be removed in a future version. Please adapt your code to use either `displot` (a figure-level function with similar flexibility) or `histplot` (an axes-level function for histograms).
warnings.warn(msg, FutureWarning)
/home/duda5/soft/gapless/gapless.py:253: UserWarning: FixedFormatter should only be used together with FixedLocator
ax.set(xticklabels=np.where(locs.astype(int) == locs, (10 ** locs).astype(str), ""))
/home/duda5/anaconda3/lib/python3.9/site-packages/seaborn/distributions.py:2619: FutureWarning: `distplot` is a deprecated function and will be removed in a future version. Please adapt your code to use either `displot` (a figure-level function with similar flexibility) or `histplot` (an axes-level function for histograms).
warnings.warn(msg, FutureWarning)
/home/duda5/soft/gapless/gapless.py:253: UserWarning: FixedFormatter should only be used together with FixedLocator
ax.set(xticklabels=np.where(locs.astype(int) == locs, (10 ** locs).astype(str), ""))
/home/duda5/anaconda3/lib/python3.9/site-packages/seaborn/distributions.py:2619: FutureWarning: `distplot` is a deprecated function and will be removed in a future version. Please adapt your code to use either `displot` (a figure-level function with similar flexibility) or `histplot` (an axes-level function for histograms).
warnings.warn(msg, FutureWarning)
/home/duda5/soft/gapless/gapless.py:253: UserWarning: FixedFormatter should only be used together with FixedLocator
ax.set(xticklabels=np.where(locs.astype(int) == locs, (10 ** locs).astype(str), ""))
/home/duda5/anaconda3/lib/python3.9/site-packages/scipy/stats/_discrete_distns.py:315: RuntimeWarning: divide by zero encountered in _nbinom_cdf
return _boost._nbinom_cdf(k, n, p)
/home/duda5/anaconda3/lib/python3.9/site-packages/scipy/stats/_discrete_distns.py:315: RuntimeWarning: divide by zero encountered in _nbinom_cdf
return _boost._nbinom_cdf(k, n, p)
/home/duda5/anaconda3/lib/python3.9/site-packages/scipy/stats/_discrete_distns.py:315: RuntimeWarning: divide by zero encountered in _nbinom_cdf
return _boost._nbinom_cdf(k, n, p)
/home/duda5/anaconda3/lib/python3.9/site-packages/scipy/stats/_discrete_distns.py:315: RuntimeWarning: divide by zero encountered in _nbinom_cdf
return _boost._nbinom_cdf(k, n, p)
/home/duda5/anaconda3/lib/python3.9/site-packages/scipy/stats/_discrete_distns.py:315: RuntimeWarning: divide by zero encountered in _nbinom_cdf
return _boost._nbinom_cdf(k, n, p)
/home/duda5/anaconda3/lib/python3.9/site-packages/scipy/stats/_discrete_distns.py:315: RuntimeWarning: divide by zero encountered in _nbinom_cdf
return _boost._nbinom_cdf(k, n, p)
/home/duda5/anaconda3/lib/python3.9/site-packages/scipy/stats/_discrete_distns.py:315: RuntimeWarning: divide by zero encountered in _nbinom_cdf
return _boost._nbinom_cdf(k, n, p)
/home/duda5/anaconda3/lib/python3.9/site-packages/scipy/stats/_discrete_distns.py:315: RuntimeWarning: divide by zero encountered in _nbinom_cdf
return _boost._nbinom_cdf(k, n, p)
/home/duda5/anaconda3/lib/python3.9/site-packages/scipy/stats/_discrete_distns.py:315: RuntimeWarning: divide by zero encountered in _nbinom_cdf
return _boost._nbinom_cdf(k, n, p)
/home/duda5/anaconda3/lib/python3.9/site-packages/scipy/stats/_discrete_distns.py:315: RuntimeWarning: divide by zero encountered in _nbinom_cdf
return _boost._nbinom_cdf(k, n, p)
0:00:02.213879 Reading in original assembly
0:00:04.935302 Loading repeats
0:00:05.341227 Filtering mappings
0:01:44.997825 Search for possible break points
0:08:56.882098 Search for possible bridges
0:09:24.659574 Scaffold the contigs
sindex from from_side scaf1 ... dist17 scaf18 strand18 dist18
0 24199 30.0 l 27.0 ... NaN NaN NaN NaN
1 24200 27.0 r 29.0 ... NaN NaN NaN NaN
2 24202 27.0 l 30.0 ... NaN NaN NaN NaN
3 24203 29.0 r 30.0 ... NaN NaN NaN NaN
4 24204 43.0 r 41.0 ... NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ...
3137 23371 899.0 l 900.0 ... 0.0 NaN NaN NaN
3138 21981 901.0 r 902.0 ... 0.0 NaN NaN NaN
3139 23785 899.0 l 900.0 ... 0.0 NaN NaN NaN
3140 15821 1113.0 l 906.0 ... 0.0 899.0 - 0.0
3141 19043 906.0 r 905.0 ... 0.0 904.0 - 0.0
[3142 rows x 58 columns]
Traceback (most recent call last):
File "/home/duda5/soft/gapless/gapless.py", line 13327, in <module>
main(sys.argv[1:])
File "/home/duda5/soft/gapless/gapless.py", line 13156, in main
GaplessScaffold(args[0], args[1], args[2], min_mapq, min_mapping_length, min_length_contig_break, prefix, stats)
File "/home/duda5/soft/gapless/gapless.py", line 9101, in GaplessScaffold
scaffold_paths, trim_repeats = ScaffoldContigs(contig_parts, bridges, mappings, cov_probs, repeats, prob_factor, min_mapping_length, max_dist_contig_end, prematurity_threshold, ploidy, max_loop_units)
File "/home/duda5/soft/gapless/gapless.py", line 7840, in ScaffoldContigs
scaffold_graph = BuildScaffoldGraph(long_range_connections, scaf_bridges)
File "/home/duda5/soft/gapless/gapless.py", line 2422, in BuildScaffoldGraph
CheckScaffoldGraphConsistency(scaffold_graph)
File "/home/duda5/soft/gapless/gapless.py", line 2358, in CheckScaffoldGraphConsistency
raise RuntimeError("Scaffold graph is inconsistent: Not all reverse entries are present.")
RuntimeError: Scaffold graph is inconsistent: Not all reverse entries are present.
Thank you for reporting this.
The divisions by zero worry me, and an inconsistent graph should never happen, so this is clearly a bug. There are two ways to fix it. The fast one is to send me the files ´gapless_split.fa´, ´gapless_reads.paf´ and ´gapless_split_repeats.paf´ (or download links) at [email protected], so that I can trace and fix the issue myself. The slow one is that I guide you through the code so that you can find where and why the issue occurs, which would then allow me to fix it.
Best, Stephan
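For the slow option, a possible first step is to find out which call triggers the "divide by zero" RuntimeWarning, since the log only shows the line inside scipy. The sketch below is not part of gapless: `first_runtime_warning` and `toy_division` are hypothetical helpers, and the toy division merely stands in for whatever gapless code ends up calling scipy. It relies on the standard `warnings` module to turn a RuntimeWarning into an exception, so that a full traceback becomes available.

```python
import warnings

import numpy as np


def first_runtime_warning(func, *args, **kwargs):
    """Run func and return the first RuntimeWarning it raises, or None.

    Hypothetical helper for illustration; it is not part of gapless.
    """
    with warnings.catch_warnings():
        # Inside this block every RuntimeWarning is raised as an exception,
        # so its traceback leads back to the calling code.
        warnings.simplefilter("error", RuntimeWarning)
        try:
            func(*args, **kwargs)
        except RuntimeWarning as exc:
            return exc
    return None


def toy_division(values):
    """Stand-in for the real computation: divides 1.0 by each value."""
    return np.float64(1.0) / np.asarray(values, dtype=float)


caught = first_runtime_warning(toy_division, [2.0, 0.0])
if caught is not None:
    print("Warning raised as exception:", caught)
```

The same effect can usually be had without touching the code by starting the interpreter with `python -W error::RuntimeWarning`, assuming the warning passes through the `warnings` module, as the format of the log lines suggests. The run then stops at the first division by zero and prints the chain of gapless calls that led to it.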