Spades-hammer stuck on reverse reads

Open alexhbnr opened this issue 5 years ago • 5 comments

When running metaspades on paired-end FASTQ files with error correction enabled, spades-hammer gets through counting the k-mers on the forward reads, but gets stuck doing the same on the reverse reads.

While it took less than 20 minutes on the forward reads, it had still not finished the reverse reads after more than 4 hours, at which point I terminated the process. The process was running at full load the whole time, using all specified CPUs, but never proceeded to the next step.

I analysed the data on several different machines, varying the maximum memory and the number of threads, but I always end up with the same result. Other samples from the same sequencing run could be processed without issues.

Any ideas what could be causing this and how I can circumvent it?

Here is the params.txt:

Command line: /home/alexhbnr/miniconda3/bin/metaspades.py      -o      /tmp/test_sample      -1      /tmp/test_sample_1.fastq.gz   -2
      /tmp/test_sample_2.fastq.gz   --mem   500     --threads       24

System information:
  SPAdes version: 3.13.1
  Python version: 3.7.1
  OS: Linux-4.4.0-128-generic-x86_64-with-debian-stretch-sid

Output dir: /tmp/test_sample
Mode: read error correction and assembling
Debug mode is turned OFF

Dataset parameters:
  Metagenomic mode
  Reads:
    Library number: 1, library type: paired-end
      orientation: fr
      left reads: ['/tmp/test_sample_1.fastq.gz']
      right reads: ['/tmp/test_sample_2.fastq.gz']
      interlaced reads: not specified
      single reads: not specified
      merged reads: not specified
Read error correction parameters:
  Iterations: 1
  PHRED offset will be auto-detected
  Corrected reads will be compressed
Assembly parameters:
  k: [21, 33, 55]
  Repeat resolution is enabled
  Mismatch careful mode is turned OFF
  MismatchCorrector will be SKIPPED
  Coverage cutoff is turned OFF
Other parameters:
  Dir for temp files: /tmp/test_sample/tmp
  Threads: 24
  Memory limit (in Gb): 500

And the spades.log:

===== Read error correction started.


== Running read error correction tool: /home/alexhbnr/miniconda3/share/spades-3.13.1-0/bin/spades-hammer /tmp/test_sample/corrected/configs/config.info

  0:00:00.000     4M / 4M    INFO    General                 (main.cpp                  :  75)   Starting BayesHammer, built from refs/heads/spades_3.13.1, git revision 9a9d54db2ff9abaac718155bf74c12ec9464e8ca
  0:00:00.000     4M / 4M    INFO    General                 (main.cpp                  :  76)   Loading config from /tmp/test_sample/corrected/configs/config.info
  0:00:00.001     4M / 4M    INFO    General                 (main.cpp                  :  78)   Maximum # of threads to use (adjusted due to OMP capabilities): 24
  0:00:00.001     4M / 4M    INFO    General                 (memory_limit.cpp          :  49)   Memory limit set to 500 Gb
  0:00:00.001     4M / 4M    INFO    General                 (main.cpp                  :  86)   Trying to determine PHRED offset
  0:00:00.042     4M / 4M    INFO    General                 (main.cpp                  :  92)   Determined value is 33
  0:00:00.042     4M / 4M    INFO    General                 (hammer_tools.cpp          :  36)   Hamming graph threshold tau=1, k=21, subkmer positions = [ 0 10 ]
  0:00:00.042     4M / 4M    INFO    General                 (main.cpp                  : 113)   Size of aux. kmer data 24 bytes
     === ITERATION 0 begins ===
  0:00:00.042     4M / 4M    INFO   K-mer Counting           (kmer_data.cpp             : 280)   Estimating k-mer count
  0:00:00.301   388M / 388M  INFO   K-mer Counting           (kmer_data.cpp             : 285)   Processing /tmp/test_sample_1.fastq.gz
  0:02:05.050   480M / 480M  INFO   K-mer Counting           (kmer_data.cpp             : 294)   Processed 48218956 reads
  0:02:05.050   480M / 480M  INFO   K-mer Counting           (kmer_data.cpp             : 285)   Processing /tmp/test_sample_2.fastq.gz
  0:04:08.661   480M / 480M  INFO   K-mer Counting           (kmer_data.cpp             : 294)   Processed 96437912 reads
  0:04:08.661   480M / 480M  INFO   K-mer Counting           (kmer_data.cpp             : 299)   Total 96437912 reads processed
  0:04:10.448   480M / 480M  INFO   K-mer Counting           (kmer_data.cpp             : 306)   Estimated 2102283700 distinct kmers
  0:04:10.477    96M / 480M  INFO   K-mer Counting           (kmer_data.cpp             : 311)   Filtering singleton k-mers
40 8 0
nslots: 4294967296
bits per slot: 8 range: 0000010000000000
  0:04:10.477     5G / 5G    INFO   K-mer Counting           (kmer_data.cpp             : 317)   Processing /tmp/test_sample_1.fastq.gz
  0:22:12.735     5G / 5G    INFO   K-mer Counting           (kmer_data.cpp             : 326)   Processed 48218956 reads
  0:22:12.736     5G / 5G    INFO   K-mer Counting           (kmer_data.cpp             : 317)   Processing /tmp/test_sample_2.fastq.gz
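
For anyone comparing logs: the stall point in a log like the one above can be located by pairing each "Processing <file>" line with the next "Processed N reads" line and checking which step never closes. Below is a minimal Python sketch, assuming the timestamp and message layout shown in this log. The function name `find_stalled_steps` is hypothetical and is not part of SPAdes.

```python
import re
from datetime import timedelta

# Matches log lines of the shape seen in spades.log, e.g.
#   0:22:12.736  5G / 5G  INFO  K-mer Counting  (kmer_data.cpp : 317)  Processing /tmp/x.fastq.gz
# Groups: hours, minutes, seconds, milliseconds, message after the "(file : line)" part.
LOG_LINE = re.compile(r"^\s*(\d+):(\d{2}):(\d{2})\.(\d{3})\s.*?\)\s+(.*)$")


def find_stalled_steps(log_text):
    """Return (finished, pending).

    finished: list of (path, duration) for every "Processing" line that was
              followed by a "Processed" line.
    pending:  path of the file still being processed when the log ends,
              or None if every step completed.
    """
    finished = []
    pending = None  # (path, start timestamp) of the step currently open
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip banners, blank lines, and unformatted output
        hours, minutes, seconds, millis, message = match.groups()
        stamp = timedelta(hours=int(hours), minutes=int(minutes),
                          seconds=int(seconds), milliseconds=int(millis))
        if message.startswith("Processing "):
            pending = (message[len("Processing "):].strip(), stamp)
        elif message.startswith("Processed ") and pending is not None:
            finished.append((pending[0], stamp - pending[1]))
            pending = None
    return finished, (pending[0] if pending else None)
```

Run on the log above, this reports the singleton-filtering pass over /tmp/test_sample_1.fastq.gz as taking about 18 minutes and leaves /tmp/test_sample_2.fastq.gz as the pending step.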

alexhbnr avatar Jun 12 '19 18:06 alexhbnr

Yes, this is a known problem with the CQF module. There is no workaround yet...

asl avatar Jun 12 '19 18:06 asl

I'm having the same issue... still no solution?

jaybake5 avatar Sep 25 '19 17:09 jaybake5

I'm getting the same issue on version 3.15.2 - weird thing is it happens for one sample and not for another one... Help? @asl

GeoMicroSoares avatar May 04 '21 06:05 GeoMicroSoares

Hi, I think I'm having the same problem. Is there still no solution?

kalonji08 avatar Jun 14 '22 03:06 kalonji08

Hi.

I hope this will help in solving this problem in future releases.

Sample background: mouse faecal metagenome
Sequencer: HiSeq 2500, 250 bp paired-end reads
Number of samples: 4

QC: KneadData (filtering of PhiX, mouse, and human genome)

Error correction: bfc

Assembly: SPAdes 3.15.4, command: spades.py --meta -1 sample1_R1_bfc_corrected.fq.gz -2 sample1_R2_bfc_corrected.fq.gz -o sample1 -t 10 -m 150

Two of my samples were assembled successfully, and two got stuck at the spades-hammer reverse-read k-mer counting step.

I repeated every step from QC for these two failed samples but again, the same problem.

So I skipped bfc correction for these two samples, leading to successful assembly using the above command.
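
In case it helps others script the same retry, here is a minimal Python sketch that builds the command above for either input set. The flags and the `_bfc_corrected` file names are the ones from my command; the uncorrected file names (`sample1_R1.fq.gz`, `sample1_R2.fq.gz`) and the function name `build_metaspades_command` are assumptions for illustration only.

```python
def build_metaspades_command(sample, use_bfc, threads=10, memory_gb=150):
    """Build the spades.py argument list for one sample.

    use_bfc=True  -> reads that went through bfc (names as in the command above)
    use_bfc=False -> reads straight from QC (file naming is an assumption)
    """
    suffix = "_bfc_corrected" if use_bfc else ""
    return [
        "spades.py", "--meta",
        "-1", f"{sample}_R1{suffix}.fq.gz",
        "-2", f"{sample}_R2{suffix}.fq.gz",
        "-o", sample,
        "-t", str(threads),
        "-m", str(memory_gb),
    ]
```

The returned list can be passed to `subprocess.run`; for the two samples that got stuck, calling it with `use_bfc=False` corresponds to the rerun described above.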

I was just wondering: if you haven't performed bfc correction and your assembly gets stuck at the spades-hammer reverse-read k-mer counting step, would running bfc correction help?

Regards,

Bhim

bhimbbiswa avatar Jun 27 '22 01:06 bhimbbiswa