masurca
error_corrected2frg segmentation fault
Hi there,
I am running an assembly using both Illumina and PacBio sequencing data, and the run stopped at "error_corrected2frg pe 400 50 2000000000 pe.tmp > pe.linking.frg.tmp" with a segmentation fault. I am using MaSuRCA-3.2.4 and have run it successfully many times with other datasets; this is the first time error_corrected2frg has segfaulted. I checked the "pe.tmp" file: it is complete and in FASTA format. Since the error message provides little information, any suggestions would be appreciated. I can share the "pe.tmp" file, though it is 16 GB... Thank you in advance.
Best, Lin
Dear Lin, I got the same error. Did you manage to get MaSuRCA to work? In my case I tried the assembly using PE libraries with two different read lengths, 250 and 100 bp. The error occurred specifically with the two libraries of 100 bp reads, while the 250 bp library ran fine. What was your read length? Olivier
Dear olivierarmant,
In our laboratory we ran into the same problem. We solved it by rewriting the original error_corrected2frg C++ code in Python; the script below works well in our case.
#!/usr/bin/env python
# Drop-in replacement for error_corrected2frg. Positional arguments match the
# original tool: library_name mean stdev <unused> input_fasta
# (sys.argv[4] is accepted for compatibility but never read by this script).
import sys

name_seq = []  # numeric read IDs parsed from the FASTA headers
ind = -1       # index of the most recently seen header

# Read the whole FASTA into memory (note: pe.tmp can be large).
with open(sys.argv[5]) as input_file:
    lines = input_file.readlines()

# Emit the version and library records. The LIB accession comes from
# sys.argv[1]; note that the fragment prefix "pa" is hardcoded further
# down and may need to match your library name.
sys.stdout.write("{" f"VER\nver:2\n" + "}" + "\n" "{"
                 f"LIB\nact:A\nacc:{sys.argv[1]}\nori:I\n"
                 f"mea:{sys.argv[2]}\nstd:{sys.argv[3]}\nsrc:\n.\n"
                 f"nft:1\nfea:\n"
                 f"doNotOverlapTrim=1\n.\n" "}" "\n")

for line in lines:
    if line.startswith('>'):
        # Header such as ">pe0000000001": drop ">" plus the two-letter
        # prefix and keep the numeric ID.
        line = line.split(" ")[0]
        line = line[3:]
        name_seq.append(int(line))
        ind += 1
    else:
        # Sequence line (keeps its trailing newline): emit one FRG record,
        # with a dummy 'E' quality for every base and a clear range
        # spanning the full read.
        sys.stdout.write("{FRG" f"\nact:A\nacc:pa{name_seq[ind]}"
                         f"\nrnd:0\nsta:G\nlib:pa\npla:0\nloc:0"
                         f"\nsrc:\n.\nseq:\n{line}.\nqlt:"
                         f"\n{'E' * (len(line) - 1)}\n.\nhps:\n."
                         f"\nclr:0,{len(line) - 1}\n" "}" "\n")

# Mates have consecutive IDs after sorting: link reads pairwise
# (assumes an even number of reads, i.e. all reads are paired).
name_seq.sort()
for i in range(0, len(name_seq), 2):
    sys.stdout.write("{" f"LKG\nact:A\nfrg:pa{name_seq[i]}\n"
                     f"frg:pa{name_seq[i+1]}\n" "}" "\n")
genaev's workaround worked for me!