quick-run issue: super reads file not found or size zero

Open tauanajc opened this issue 3 years ago • 6 comments

Hi there!

I recently installed v4.0.3 and tried running the new quick run of MaSuRCA. It seemed fine until it failed with the following error:

super reads file not found or size zero, you can try deleting work1 folder and re-generating assemble.sh, also check if guillaumeKUnitigsAtLeast32bases_all.fasta is not empty
cat: CA_dir.txt: No such file or directory

I checked and the guillaume file is not empty and looks fine.
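The error message names two conditions, a missing file or a zero-size file, so both can be checked directly. Below is a minimal Python sketch, assuming it is run from the assembly directory. The helper name `check_outputs` is hypothetical, and the only paths used are the ones that appear in the messages above.

```python
import os

def check_outputs(paths):
    """Map each path to 'missing', 'empty', or its size in bytes."""
    report = {}
    for path in paths:
        if not os.path.isfile(path):
            # Covers both "does not exist" and "is not a regular file".
            report[path] = "missing"
        else:
            size = os.path.getsize(path)
            report[path] = "empty" if size == 0 else size
    return report
```

Calling `check_outputs(["guillaumeKUnitigsAtLeast32bases_all.fasta", "CA_dir.txt"])` would presumably report `CA_dir.txt` as missing for this run, since `cat` could not find it either.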

Output file looks like this:

Verifying PATHS...
jellyfish OK
runCA OK
createSuperReadsForDirectory.perl OK
creating script file for the actions...done.
execute assemble.sh to run assembly
[Thu Apr 15 17:38:27 EDT 2021] Processing pe library reads
[Thu Apr 15 17:52:00 EDT 2021] Average PE read length 140
[Thu Apr 15 17:52:00 EDT 2021] Using kmer size of 49 for the graph
[Thu Apr 15 17:52:01 EDT 2021] MIN_Q_CHAR: 33
WARNING: JF_SIZE set too low, increasing JF_SIZE to at least 2536856844, this automatic increase may be not enough!
[Thu Apr 15 17:52:01 EDT 2021] Creating mer database for Quorum
[Thu Apr 15 18:06:53 EDT 2021] Error correct PE
[Thu Apr 15 18:44:17 EDT 2021] Estimating genome size
[Thu Apr 15 18:53:13 EDT 2021] Estimated genome size: 381409113
[Thu Apr 15 18:53:13 EDT 2021] Creating k-unitigs with k=49
[Thu Apr 15 19:22:14 EDT 2021] Computing super reads from PE 
[Thu Apr 15 21:33:17 EDT 2021] Using CABOG from /n/holylfs04/LABS/giribet_lab/Lab/tauanajc/scripts/MaSuRCA-4.0.3/bin/../CA8/Linux-amd64/bin
[Thu Apr 15 21:33:18 EDT 2021] Assembly stopped or failed, see .log

The pe.cor.tmp.log file has several lines reporting skipped reads, with the message: No high quality mer.
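To judge how widespread those skips are, the matching lines can be counted. This is a rough sketch that assumes only that each affected line in pe.cor.tmp.log contains the phrase "No high quality mer" as quoted above. It assumes nothing else about the log's layout, and `count_skipped` is a hypothetical helper.

```python
def count_skipped(lines, phrase="No high quality mer"):
    """Return (matching, total) line counts for an iterable of log lines."""
    total = 0
    matching = 0
    for line in lines:
        total += 1
        if phrase in line:
            matching += 1
    return matching, total
```

It can be fed an open file handle, for example `with open("pe.cor.tmp.log") as fh: print(count_skipped(fh))`. Comparing the count with the number of input reads gives a sense of whether the skips are a handful or a large share of the data.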

I've looked at many other issues here on GitHub but didn't find something that looked the same. Any ideas of what could be happening?

My command was: masurca -t 32 -i $illumina1,$illumina2 -r $ont

The genome is about 135 Mb as estimated by GenomeScope, although MaSuRCA is putting it at about 380 Mb.
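The gap between the two genome size estimates is large enough to be worth quantifying. This small sketch pulls the figure out of the log line shown above and compares it with the GenomeScope value. The helper names are hypothetical, and the GenomeScope estimate is taken here as 135,000,000 bases.

```python
import re

def parse_estimated_size(log_text):
    """Return the integer after 'Estimated genome size:', or None if absent."""
    m = re.search(r"Estimated genome size:\s*(\d+)", log_text)
    return int(m.group(1)) if m else None

def size_ratio(masurca_estimate, expected):
    """How many times larger the MaSuRCA estimate is than the expected size."""
    return masurca_estimate / expected

# Log line copied from the output above.
line = "[Thu Apr 15 18:53:13 EDT 2021] Estimated genome size: 381409113"
ratio = size_ratio(parse_estimated_size(line), 135_000_000)
print(f"MaSuRCA estimate is {ratio:.1f}x the GenomeScope estimate")
```

For the numbers in this post the ratio comes out at roughly 2.8, so the discrepancy is not a rounding matter.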

Thanks in advance!

tauanajc, Apr 16 '21 14:04

I have a similar problem. Could you check whether a test genome assembles? It may be a memory problem, a Jellyfish build that failed while compiling MaSuRCA, or very high coverage of PE reads.
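For the coverage point, a back-of-the-envelope estimate is total sequenced bases divided by genome size. Here is a sketch in Python. The read count is a made-up placeholder, not a value from this thread, while the read length and the two genome sizes are the ones quoted in the original post.

```python
def estimate_coverage(num_reads, mean_read_length, genome_size):
    """Approximate fold coverage: total sequenced bases / genome size."""
    return num_reads * mean_read_length / genome_size

# Hypothetical total read count across both mates; replace with the real number.
NUM_READS = 200_000_000
# Average PE read length reported in the log above.
MEAN_LEN = 140

for label, size in [("GenomeScope", 135_000_000), ("MaSuRCA", 381_409_113)]:
    cov = estimate_coverage(NUM_READS, MEAN_LEN, size)
    print(f"{label} genome size: ~{cov:.0f}x coverage")
```

Because the two genome size estimates differ by almost a factor of three, the implied coverage differs by the same factor, which matters when deciding whether the PE data should be subsampled.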

asan-emirsaleh, Jun 28 '21 14:06

I also have the same problem. Did you find out what causes it and how to fix it?

Panas

panastheod, Aug 04 '21 15:08

I also have the same problem. Any news? Thanks

josruirod, Nov 09 '21 09:11

I ran into the same problem. Did you find out how to resolve it? Thanks

lly1214, Feb 16 '22 07:02

I am facing the same issue, has anyone managed to find a solution for this?

surabhiranavat, Mar 13 '23 10:03