My job has been running for more than 5 days with no update. Can anyone help me?

Open ec-ho-ra-mos opened this issue 3 years ago • 0 comments

Objective: I am trying to use Snippy to call SNPs and write them to a .vcf file, using (1) a .fasta reference and (2) paired .fastq.gz reads as input.

Problem: After 5 days on 8 CPUs, the output has not changed. Has the run encountered an error?

This is my script:

# File sizes: reference FASTA = 608 MB; R1 FASTQ = 6.05 GB; R2 FASTQ = 6.09 GB
snippy \
--subsample 0.01 \
--force \
--report \
--cpus 8 \
--outdir ***/12_snps/ \
--ref ***/f1_trinity_out_dir.Trinity.fasta \
--R1 ***/F1-1A_S1_R1_001.fastq.gz \
--R2 ***/F1-1A_S1_R2_001.fastq.gz

This is the output:

[18:37:12] This is snippy 4.6.0
[18:37:12] Written by Torsten Seemann
[18:37:12] Obtained from https://github.com/tseemann/snippy
[18:37:12] Detected operating system: linux
[18:37:12] Enabling bundled linux tools.
[18:37:12] Found bwa - /snippy-4.6.0/binaries/linux/bwa
[18:37:12] Found bcftools - /snippy-4.6.0/binaries/linux/bcftools
[18:37:12] Found samtools - /snippy-4.6.0/binaries/linux/samtools
[18:37:12] Found java - /usr/bin/java
[18:37:12] Found snpEff - /snippy-4.6.0/binaries/noarch/snpEff
[18:37:12] Found samclip - /snippy-4.6.0/binaries/noarch/samclip
[18:37:12] Found seqtk - /snippy-4.6.0/binaries/linux/seqtk
[18:37:12] Found parallel - /snippy-4.6.0/binaries/noarch/parallel
[18:37:12] Found freebayes - /snippy-4.6.0/binaries/linux/freebayes
[18:37:12] Found freebayes-parallel - /snippy-4.6.0/binaries/noarch/freebayes-parallel
[18:37:12] Found fasta_generate_regions.py - /snippy-4.6.0/binaries/noarch/fasta_generate_regions.py
[18:37:12] Found vcfstreamsort - /snippy-4.6.0/binaries/linux/vcfstreamsort
[18:37:12] Found vcfuniq - /snippy-4.6.0/binaries/linux/vcfuniq
[18:37:12] Found vcffirstheader - /snippy-4.6.0/binaries/noarch/vcffirstheader
[18:37:12] Found gzip - /bin/gzip
[18:37:12] Found vt - /snippy-4.6.0/binaries/linux/vt
[18:37:12] Found snippy-vcf_to_tab - /snippy-4.6.0/bin/snippy-vcf_to_tab
[18:37:12] Found snippy-vcf_report - /snippy-4.6.0/bin/snippy-vcf_report
[18:37:12] Checking version: samtools --version is >= 1.7 - ok, have 1.10
[18:37:12] Checking version: bcftools --version is >= 1.7 - ok, have 1.10
[18:37:12] Checking version: freebayes --version is >= 1.1 - ok, have 1.3.1
[18:37:13] Checking version: snpEff -version is >= 4.3 - ok, have 4.3
[18:37:13] Checking version: bwa is >= 0.7.12 - ok, have 0.7.17
[18:37:13] Using reference: ***/f1_trinity_out_dir.Trinity.fasta
[18:37:15] Treating reference as 'fasta' format.
[18:37:15] Will use 8 CPU cores.
[18:37:15] Using read file: ***/F1-1A_S1_R1_001.fastq.gz
[18:37:15] Using read file: ***/F1-1A_S1_R2_001.fastq.gz
[18:37:15] Creating folder: ***/12_snps/
[18:37:15] Changing working directory: ***/12_snps/
[18:37:15] Creating reference folder: reference
[18:37:15] Extracting FASTA and GFF from reference.
[18:38:57] Wrote 653007 sequences to ref.fa
[18:38:57] Wrote 0 features to ref.gff
[18:38:57] Sub-sampling reads at rate 0.01
[18:38:57] Freebayes will process 15 chunks of 42164290 bp, 8 chunks at a time.
[18:38:57] Using BAM RG (Read Group) ID: 12_snps
[18:38:57] Running: seqtk sample \***/F1\-1A_S1_R1_001\.fastq\.gz 0.01 > subsampled\.F1\-1A_S1_R1_001\.fastq 2>> snps.log
[18:39:45] Running: seqtk sample \***/F1\-1A_S1_R2_001\.fastq\.gz 0.01 > subsampled\.F1\-1A_S1_R2_001\.fastq 2>> snps.log
[18:40:33] Running: samtools faidx reference/ref.fa 2>> snps.log
[18:40:37] Running: bwa index reference/ref.fa 2>> snps.log
[18:54:54] Running: mkdir -p reference/genomes && cp -f reference/ref.fa reference/genomes/ref.fa 2>> snps.log
[18:54:55] Running: ln -sf reference/ref.fa . 2>> snps.log
[18:54:55] Running: ln -sf reference/ref.fa.fai . 2>> snps.log
[18:54:55] Running: mkdir -p reference/ref && gzip -c reference/ref.gff > reference/ref/genes.gff.gz 2>> snps.log
[18:54:56] Running: bwa mem  -Y -M -R '@RG\tID:12_snps\tSM:12_snps' -t 8 reference/ref.fa subsampled.F1-1A_S1_R1_001.fastq subsampled.F1-1A_S1_R2_001.fastq | samclip --max 10 --ref reference/ref.fa.fai | samtools sort -n -l 0 -T /tmp --threads 3 -m 2000M | samtools fixmate -m --threads 3 - - | samtools sort -l 0 -T /tmp --threads 3 -m 2000M | samtools markdup -T /tmp --threads 3 -r -s - - > snps.bam 2>> snps.log
[samclip] samclip 0.4.0 by Torsten Seemann (@torstenseemann)
[samclip] Loading: reference/ref.fa.fai
[samclip] Found 653007 sequences in reference/ref.fa.fai
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 370944 sequences (49760917 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 8898, 1, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (117, 143, 171)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (9, 279)
[M::mem_pestat] mean and std.dev: (145.42, 42.34)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 333)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 370944 reads in 382.644 CPU sec, 48.752 real sec
[samclip] Processed 100000 records...
[samclip] Processed 200000 records...
[samclip] Processed 300000 records...
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -Y -M -R @RG\tID:12_snps\tSM:12_snps -t 8 reference/ref.fa subsampled.F1-1A_S1_R1_001.fastq subsampled.F1-1A_S1_R2_001.fastq
[main] Real time: 56.097 sec; CPU: 387.107 sec
[samclip] Total SAM records 377055, removed 16221, allowed 17114, passed 360834
[samclip] Header contained 653009 lines
[samclip] Done.
[bam_sort_core] merging from 0 files and 3 in-memory blocks...
[bam_sort_core] merging from 0 files and 3 in-memory blocks...
[18:56:02] Running: samtools index snps.bam 2>> snps.log
[18:56:03] Running: fasta_generate_regions.py reference/ref.fa.fai 42164290 > reference/ref.txt 2>> snps.log
[18:56:06] Running: freebayes-parallel reference/ref.txt 8 -p 2 -P 0 -C 2 -F 0.05 --min-coverage 10 --min-repeat-entropy 1.0 -q 13 -m 60 --strict-vcf   -f reference/ref.fa snps.bam > snps.raw.vcf 2>> snps.log

The log has stayed at that last line for more than 5 days with no change. Could the job have run into an error? Any suggestions would be appreciated.
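One way to tell a stalled job from a slow one is to look at the files the last step reads and writes. The sketch below is only an illustration: `check_progress` is a hypothetical helper, not part of snippy, and it assumes the file names that appear in the log above (`reference/ref.txt`, `snps.raw.vcf`, `snps.log`) inside the output folder.

```shell
#!/bin/sh
# Hypothetical helper (not part of snippy): report how far the freebayes step has got.
# Assumes the file names shown in the log: reference/ref.txt, snps.raw.vcf, snps.log.
check_progress() {
  outdir="${1:-.}"

  # Regions queued for freebayes-parallel (one per line in the region file).
  regions=$(wc -l < "$outdir/reference/ref.txt" | tr -d ' ')

  # Variant records written so far; VCF header lines start with '#'.
  # grep exits non-zero when the count is 0, so its exit status is ignored.
  records=$(grep -vc '^#' "$outdir/snps.raw.vcf" || true)

  echo "regions=$regions records=$records"

  # Last few messages from the tools, where a crash would normally be reported.
  tail -n 5 "$outdir/snps.log"
}
```

Calling `check_progress` from inside the output folder twice, some minutes apart, shows whether the record count is growing. A growing count suggests the job is slow rather than dead, while an unchanged count is not conclusive on its own. If the region count turns out to be far larger than the 15 chunks announced in the log, for example close to the 653007 reference sequences, that would be one possible reason for the long runtime, not a confirmed diagnosis.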

ec-ho-ra-mos, Apr 08 '21 18:04