snippy icon indicating copy to clipboard operation
snippy copied to clipboard

get 0 variant after snippy

Open Rathanin-github opened this issue 3 years ago • 7 comments

Hi @tseemann When I update my snippy to 4.6.0, I got 0 varaint after runing. snps.csv contain nothing. Please see my snps.log below Your response is greatly apriciate !

echo snippy 4.6.0

cd /mnt/e/test4/17A

/home/nc-bioinfo/anaconda3/envs/snippy/bin/snippy --outdir mysnps17A.fa --ref K96243.fasta -R1 60-101_S1_L001_R1_001.fastq.gz -R2 60-101_S1_L001_R2_001.fastq.gz

samtools faidx reference/ref.fa

bwa index reference/ref.fa

[bwa_index] Pack FASTA... 0.05 sec [bwa_index] Construct BWT for the packed sequence... [bwa_index] 1.41 seconds elapse. [bwa_index] Update BWT... 0.03 sec [bwa_index] Pack forward-only FASTA... 0.02 sec [bwa_index] Construct SA from BWT and Occ... 0.50 sec [main] Version: 0.7.17-r1188 [main] CMD: bwa index reference/ref.fa [main] Real time: 4.973 sec; CPU: 2.000 sec

mkdir -p reference/genomes && cp -f reference/ref.fa reference/genomes/ref.fa

ln -sf reference/ref.fa .

ln -sf reference/ref.fa.fai .

mkdir -p reference/ref && gzip -c reference/ref.gff > reference/ref/genes.gff.gz

bwa mem -Y -M -R '@RG\tID:mysnps17A.fa\tSM:mysnps17A.fa' -t 8 reference/ref.fa /mnt/e/test4/17A/60-101_S1_L001_R1_001.fastq.gz /mnt/e/test4/17A/60-101_S1_L001_R2_001.fastq.gz | samclip --max 10 --ref reference/ref.fa.fai | samtools

sort -n -l 0 -T /tmp --threads 3 -m 2000M | samtools fixmate -m --threads 3 - - | samtools sort -l 0 -T /tmp --threads 3 -m 2000M | samtools markdup -T /tmp --threads 3 -r -s - - > snps.bam

COMMAND: samtools markdup -T /tmp --threads 3 -r -s - - READ: 3462693 WRITTEN: 3370831 EXCLUDED: 87497 EXAMINED: 3375196 PAIRED: 3318838 SINGLE: 56358 DUPLICATE PAIR: 57848 DUPLICATE SINGLE: 34014 DUPLICATE PAIR OPTICAL: 0 DUPLICATE SINGLE OPTICAL: 0 DUPLICATE NON PRIMARY: 0 DUPLICATE NON PRIMARY OPTICAL: 0 DUPLICATE PRIMARY TOTAL: 91862 DUPLICATE TOTAL: 91862 ESTIMATED_LIBRARY_SIZE: 47047081

samtools index snps.bam

fasta_generate_regions.py reference/ref.fa.fai 491233 > reference/ref.txt

freebayes-parallel reference/ref.txt 8 -p 2 -P 0 -C 2 -F 0.05 --min-coverage 10 --min-repeat-entropy 1.0 -q 13 -m 60 --strict-vcf -f reference/ref.fa snps.bam > snps.raw.vcf

sleep: cannot read realtime clock: Invalid argument sleep: cannot read realtime clock: Invalid argument sleep: cannot read realtime clock: Invalid argument sleep: cannot read realtime clock: Invalid argument sleep: cannot read realtime clock: Invalid argument sleep: cannot read realtime clock: Invalid argument sleep: cannot read realtime clock: Invalid argument sleep: cannot read realtime clock: Invalid argument

bcftools view --include 'FMT/GT="1/1" && QUAL>=100 && FMT/DP>=10 && (FMT/AO)/(FMT/DP)>=0' snps.raw.vcf | vt normalize -r reference/ref.fa - | bcftools annotate --remove '^INFO/TYPE,^INFO/DP,^INFO/RO,^INFO/AO,^INFO/AB,^FORMAT/GT,^FOR

MAT/DP,^FORMAT/RO,^FORMAT/AO,^FORMAT/QR,^FORMAT/QA,^FORMAT/GL' > snps.filt.vcf

cp snps.filt.vcf snps.vcf

/home/nc-bioinfo/anaconda3/envs/snippy/bin/snippy-vcf_to_tab --gff reference/ref.gff --ref reference/ref.fa --vcf snps.vcf > snps.tab

Loading reference: reference/ref.fa Loaded 2 sequences. Loading features: reference/ref.gff Parsing variants: snps.vcf Converted 0 SNPs to TAB format.

/home/nc-bioinfo/anaconda3/envs/snippy/bin/snippy-vcf_extract_subs snps.filt.vcf > snps.subs.vcf

bcftools convert -Oz -o snps.vcf.gz snps.vcf

bcftools index -f snps.vcf.gz

bcftools consensus --sample mysnps17A.fa -f reference/ref.fa -o snps.consensus.fa snps.vcf.gz

Applied 0 variants

bcftools convert -Oz -o snps.subs.vcf.gz snps.subs.vcf

bcftools index -f snps.subs.vcf.gz

bcftools consensus --sample mysnps17A.fa -f reference/ref.fa -o snps.consensus.subs.fa snps.subs.vcf.gz

Applied 0 variants

rm -f snps.subs.vcf.gz snps.subs.vcf.gz.csi snps.subs.vcf.gz.tbi

Rathanin-github avatar Jul 24 '20 06:07 Rathanin-github

Hi @Rathanin-github, I also got this issue with a recent re-install of snippy v4.6.0 within the last 2 weeks.

For me, the variants disappear right at this point:

bcftools view --include 'FMT/GT="1/1" && QUAL>=100 && FMT/DP>=10 && (FMT/AO)/(FMT/DP)>=0' snps.raw.vcf | vt normalize -r reference/ref.fa - 

I used conda to install snippy, and the dependency [vt] has very recently been updated (Anaconda release log) so I suspected that was the problem. My solution was to downgrade vt to an older, stable version and do a fresh install of snippy:

conda create --name snippy-env
conda install --name snippy-env -c bioconda/label/cf201901 vt==2015.11.10
conda install --name snippy-env -c bioconda snippy

Let me know if this fix is applicable to you, or if you can find a better solution!

ktmeaton avatar Jul 25 '20 17:07 ktmeaton

Also having the same problem, but using older version (4.3.5). I have zero SNPS. I am attaching the log for reference.

This seems to be the error source:

Parse error at line 5 samtools sort: truncated file. Aborting [fputs] Broken pipe

[02:16:39] This is snippy 4.3.5 [02:16:39] Written by Torsten Seemann [02:16:39] Obtained from https://github.com/tseemann/snippy

[02:16:39] Found bwa - /miniconda3/envs/snippy/bin/bwa [02:16:39] Found bcftools - /miniconda3/envs/snippy/bin/bcftools [02:16:39] Found samtools -/miniconda3/envs/snippy/bin/samtools [02:16:39] Found java - /miniconda3/envs/snippy/bin/java [02:16:39] Found snpEff - /miniconda3/envs/snippy/bin/snpEff [02:16:39] Found samclip - /miniconda3/envs/snippy/bin/samclip [02:16:39] Found seqtk - /miniconda3/envs/snippy/bin/seqtk [02:16:39] Found parallel -/miniconda3/envs/snippy/bin/parallel [02:16:39] Found freebayes -miniconda3/envs/snippy/bin/freebayes [02:16:39] Found freebayes-parallel - /miniconda3/envs/snippy/bin/freebayes-parallel [02:16:39] Found fasta_generate_regions.py - /miniconda3/envs/snippy/bin/fasta_generate_regions.py [02:16:39] Found vcfstreamsort - /miniconda3/envs/snippy/bin/vcfstreamsort [02:16:39] Found vcfuniq - /miniconda3/envs/snippy/bin/vcfuniq [02:16:39] Found vcffirstheader -/miniconda3/envs/snippy/bin/vcffirstheader [02:16:39] Found gzip - /usr/bin/gzip [02:16:39] Found vt - /miniconda3/envs/snippy/bin/vt [02:16:39] Found snippy-vcf_to_tab - /miniconda3/envs/snippy/bin/snippy-vcf_to_tab [02:16:39] Found snippy-vcf_report - /miniconda3/envs/snippy/bin/snippy-vcf_report [02:16:39] Checking version: samtools --version is >= 1.7 - ok, have 1.9 [02:16:39] Checking version: bcftools --version is >= 1.7 - ok, have 1.9 [02:16:39] Checking version: freebayes --version is >= 1.1 - ok, have 1.2 [02:16:39] Checking version: snpEff -version is >= 4.3 - ok, have 4.3 [02:16:39] Using reference: /cogseq.fa.split/wuhan.gbk [02:16:39] Treating reference as 'genbank' format. [02:16:39] Will use 8 CPU cores. [02:16:39] Using read file: /fasta_COGS/cogseq.fa.split/England__201380042__2020.fa [02:16:39] Creating folder: England__201380042__2020 [02:16:39] Changing working directory: England__201380042__2020 [02:16:39] Creating reference folder: reference [02:16:39] Extracting FASTA and GFF from reference. [02:16:39] Wrote 1 sequences to ref.fa [02:16:39] Wrote 12 features to ref.gff [02:16:39] Shredding /cogseq.fa.split/England__201380042__2020.fa into pseudo-reads [02:16:39] Wrote 2434 fake 250bp reads (20x, stride 25bp) to fake_reads.fq [02:16:39] Creating reference/snpeff.config [02:16:39] Freebayes will process 15 chunks of 2032 bp, 8 chunks at a time. [02:16:39] Using BAM RG (Read Group) ID: England__201380042__2020 [02:16:39] Running: samtools faidx reference/ref.fa 2>> snps.log [02:16:39] Running: bwa index reference/ref.fa 2>> snps.log [02:16:39] Running: mkdir -p reference/genomes && cp -f reference/ref.fa reference/genomes/ref.fa 2>> snps.log [02:16:39] Running: ln -sf reference/ref.fa . 2>> snps.log [02:16:39] Running: ln -sf reference/ref.fa.fai . 2>> snps.log [02:16:39] Running: mkdir -p reference/ref && gzip -c reference/ref.gff > reference/ref/genes.gff.gz 2>> snps.log [02:16:39] Running: snpEff build -c reference/snpeff.config -dataDir . -gff3 ref 2>> snps.log [02:16:41] Running: bwa mem -Y -M -R '@RG\tID:England__201380042__2020\tSM:England__201380042__2020' -t 8 reference/ref.fa fake_reads.fq | samclip --max 10 --ref reference/ref.fa.fai | samtools sort -n -l 0 -T /var/folders/s1/05bh_q_s15n3tdrqnn7r0ktr0000gn/T//snippy.15470. --threads 8 -m 1000M | samtools fixmate -m - - | samtools sort -l 0 -T /var/folders/s1/05bh_q_s15n3tdrqnn7r0ktr0000gn/T//snippy.15470. --threads 8 -m 1000M | samtools markdup -T /var/folders/s1/05bh_q_s15n3tdrqnn7r0ktr0000gn/T//snippy.15470. -r -s - - > snps.bam 2>> snps.log [M::bwa_idx_load_from_disk] read 0 ALT contigs [M::process] read 2434 sequences (608500 bp)... [M::mem_process_seqs] Processed 2434 reads in 0.187 CPU sec, 0.027 real sec [samclip] samclip 0.2 by Torsten Seemann (@torstenseemann) [samclip] Loading: reference/ref.fa.fai [samclip] Found 1 sequences in reference/ref.fa.fai [E::sam_parse1] CIGAR and query sequence are of different length [W::sam_read1] Parse error at line 5 samtools sort: truncated file. Aborting [fputs] Broken pipe [02:16:41] Running: samtools index snps.bam 2>> snps.log [02:16:41] Running: fasta_generate_regions.py reference/ref.fa.fai 2032 > reference/ref.txt 2>> snps.log [02:16:42] Running: freebayes-parallel reference/ref.txt 8 -p 2 -P 0 -C 10 --min-repeat-entropy 1.5 --strict-vcf -q 13 -m 60 --min-coverage 10 -F 0.05 -f reference/ref.fa snps.bam > snps.raw.vcf 2>> snps.log [02:16:42] Running: bcftools view --include 'FMT/GT="1/1" && QUAL>=100 && FMT/DP>=10 && (FMT/AO)/(FMT/DP)>=0' snps.raw.vcf | vt normalize -r reference/ref.fa - | bcftools annotate --remove '^INFO/TYPE,^INFO/DP,^INFO/RO,^INFO/AO,^INFO/AB,^FORMAT/GT,^FORMAT/DP,^FORMAT/RO,^FORMAT/AO,^FORMAT/QR,^FORMAT/QA,^FORMAT/GL' > snps.filt.vcf 2>> snps.log normalize v0.5

options: input VCF file - [o] output VCF file - [w] sorting window size 10000 [m] no fail on masked reference inconsistency false [n] no fail on reference inconsistency false [q] quiet false [d] debug false [r] reference FASTA file reference/ref.fa

stats: biallelic no. left trimmed : 0 no. right trimmed : 0 no. left and right trimmed : 0 no. right trimmed and left aligned : 0 no. left aligned : 0

   total no. biallelic normalized           : 0

   multiallelic
      no. left trimmed                      : 0
      no. right trimmed                     : 0
      no. left and right trimmed            : 0
      no. right trimmed and left aligned    : 0
      no. left aligned                      : 0

   total no. multiallelic normalized        : 0

   total no. variants normalized            : 0
   total no. variants observed              : 0
   total no. reference observed             : 0

Time elapsed: 0.00s

darwinbandoy avatar Jul 28 '20 09:07 darwinbandoy

Hi Katherine Thank you so much for your solution. I can fix my problem by:

conda create --name snippy-env conda install --name snippy-env -c bioconda/label/cf201901 vt==2015.11.10 conda install --name snippy-env -c bioconda snippy

Best regards

Rathanin

On Tue, Jul 28, 2020 at 4:42 PM Darwin Bandoy [email protected] wrote:

Also having the same problem, but using older version (4.3.5). I have zero SNPS. I am attaching the log for reference.

This seems to be the error source:

Parse error at line 5 samtools sort: truncated file. Aborting [fputs] Broken pipe

[02:16:39] This is snippy 4.3.5 [02:16:39] Written by Torsten Seemann [02:16:39] Obtained from https://github.com/tseemann/snippy

[02:16:39] Found bwa - /miniconda3/envs/snippy/bin/bwa [02:16:39] Found bcftools - /miniconda3/envs/snippy/bin/bcftools [02:16:39] Found samtools -/miniconda3/envs/snippy/bin/samtools [02:16:39] Found java - /miniconda3/envs/snippy/bin/java [02:16:39] Found snpEff - /miniconda3/envs/snippy/bin/snpEff [02:16:39] Found samclip - /miniconda3/envs/snippy/bin/samclip [02:16:39] Found seqtk - /miniconda3/envs/snippy/bin/seqtk [02:16:39] Found parallel -/miniconda3/envs/snippy/bin/parallel [02:16:39] Found freebayes -miniconda3/envs/snippy/bin/freebayes [02:16:39] Found freebayes-parallel - /miniconda3/envs/snippy/bin/freebayes-parallel [02:16:39] Found fasta_generate_regions.py - /miniconda3/envs/snippy/bin/fasta_generate_regions.py [02:16:39] Found vcfstreamsort - /miniconda3/envs/snippy/bin/vcfstreamsort [02:16:39] Found vcfuniq - /miniconda3/envs/snippy/bin/vcfuniq [02:16:39] Found vcffirstheader -/miniconda3/envs/snippy/bin/vcffirstheader [02:16:39] Found gzip - /usr/bin/gzip [02:16:39] Found vt - /miniconda3/envs/snippy/bin/vt [02:16:39] Found snippy-vcf_to_tab - /miniconda3/envs/snippy/bin/snippy-vcf_to_tab [02:16:39] Found snippy-vcf_report - /miniconda3/envs/snippy/bin/snippy-vcf_report [02:16:39] Checking version: samtools --version is >= 1.7 - ok, have 1.9 [02:16:39] Checking version: bcftools --version is >= 1.7 - ok, have 1.9 [02:16:39] Checking version: freebayes --version is >= 1.1 - ok, have 1.2 [02:16:39] Checking version: snpEff -version is >= 4.3 - ok, have 4.3 [02:16:39] Using reference: /cogseq.fa.split/wuhan.gbk [02:16:39] Treating reference as 'genbank' format. [02:16:39] Will use 8 CPU cores. [02:16:39] Using read file: /fasta_COGS/cogseq.fa.split/England__201380042__2020.fa [02:16:39] Creating folder: England__201380042__2020 [02:16:39] Changing working directory: England__201380042__2020 [02:16:39] Creating reference folder: reference [02:16:39] Extracting FASTA and GFF from reference. [02:16:39] Wrote 1 sequences to ref.fa [02:16:39] Wrote 12 features to ref.gff [02:16:39] Shredding /cogseq.fa.split/England__201380042__2020.fa into pseudo-reads [02:16:39] Wrote 2434 fake 250bp reads (20x, stride 25bp) to fake_reads.fq [02:16:39] Creating reference/snpeff.config [02:16:39] Freebayes will process 15 chunks of 2032 bp, 8 chunks at a time. [02:16:39] Using BAM RG (Read Group) ID: England__201380042__2020 [02:16:39] Running: samtools faidx reference/ref.fa 2>> snps.log [02:16:39] Running: bwa index reference/ref.fa 2>> snps.log [02:16:39] Running: mkdir -p reference/genomes && cp -f reference/ref.fa reference/genomes/ref.fa 2>> snps.log [02:16:39] Running: ln -sf reference/ref.fa . 2>> snps.log [02:16:39] Running: ln -sf reference/ref.fa.fai . 2>> snps.log [02:16:39] Running: mkdir -p reference/ref && gzip -c reference/ref.gff > reference/ref/genes.gff.gz 2>> snps.log [02:16:39] Running: snpEff build -c reference/snpeff.config -dataDir . -gff3 ref 2>> snps.log [02:16:41] Running: bwa mem -Y -M -R '@rg https://github.com/rg\tID:England__201380042__2020\tSM:England__201380042__2020' -t 8 reference/ref.fa fake_reads.fq | samclip --max 10 --ref reference/ref.fa.fai | samtools sort -n -l 0 -T /var/folders/s1/05bh_q_s15n3tdrqnn7r0ktr0000gn/T//snippy.15470. --threads 8 -m 1000M | samtools fixmate -m - - | samtools sort -l 0 -T /var/folders/s1/05bh_q_s15n3tdrqnn7r0ktr0000gn/T//snippy.15470. --threads 8 -m 1000M | samtools markdup -T /var/folders/s1/05bh_q_s15n3tdrqnn7r0ktr0000gn/T//snippy.15470. -r -s - - > snps.bam 2>> snps.log [M::bwa_idx_load_from_disk] read 0 ALT contigs [M::process] read 2434 sequences (608500 bp)... [M::mem_process_seqs] Processed 2434 reads in 0.187 CPU sec, 0.027 real sec [samclip] samclip 0.2 by Torsten Seemann (@torstenseemann) [samclip] Loading: reference/ref.fa.fai [samclip] Found 1 sequences in reference/ref.fa.fai [E::sam_parse1] CIGAR and query sequence are of different length [W::sam_read1] Parse error at line 5 samtools sort: truncated file. Aborting [fputs] Broken pipe [02:16:41] Running: samtools index snps.bam 2>> snps.log [02:16:41] Running: fasta_generate_regions.py reference/ref.fa.fai 2032 > reference/ref.txt 2>> snps.log [02:16:42] Running: freebayes-parallel reference/ref.txt 8 -p 2 -P 0 -C 10 --min-repeat-entropy 1.5 --strict-vcf -q 13 -m 60 --min-coverage 10 -F 0.05 -f reference/ref.fa snps.bam > snps.raw.vcf 2>> snps.log [02:16:42] Running: bcftools view --include 'FMT/GT="1/1" && QUAL>=100 && FMT/DP>=10 && (FMT/AO)/(FMT/DP)>=0' snps.raw.vcf | vt normalize -r reference/ref.fa - | bcftools annotate --remove '^INFO/TYPE,^INFO/DP,^INFO/RO,^INFO/AO,^INFO/AB,^FORMAT/GT,^FORMAT/DP,^FORMAT/RO,^FORMAT/AO,^FORMAT/QR,^FORMAT/QA,^FORMAT/GL'

snps.filt.vcf 2>> snps.log normalize v0.5

options: input VCF file - [o] output VCF file - [w] sorting window size 10000 [m] no fail on masked reference inconsistency false [n] no fail on reference inconsistency false [q] quiet false [d] debug false [r] reference FASTA file reference/ref.fa

stats: biallelic no. left trimmed : 0 no. right trimmed : 0 no. left and right trimmed : 0 no. right trimmed and left aligned : 0 no. left aligned : 0

total no. biallelic normalized : 0

multiallelic no. left trimmed : 0 no. right trimmed : 0 no. left and right trimmed : 0 no. right trimmed and left aligned : 0 no. left aligned : 0

total no. multiallelic normalized : 0

total no. variants normalized : 0 total no. variants observed : 0 total no. reference observed : 0

Time elapsed: 0.00s

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/tseemann/snippy/issues/410#issuecomment-664930784, or unsubscribe https://github.com/notifications/unsubscribe-auth/AM6JNDFSCQT7OD4XER3VWY3R52MRZANCNFSM4PGNE7EQ .

Rathanin-github avatar Jul 30 '20 08:07 Rathanin-github

Hi @Rathanin-github, I also got this issue with a recent re-install of snippy v4.6.0 within the last 2 weeks.

For me, the variants disappear right at this point:

bcftools view --include 'FMT/GT="1/1" && QUAL>=100 && FMT/DP>=10 && (FMT/AO)/(FMT/DP)>=0' snps.raw.vcf | vt normalize -r reference/ref.fa - 

I used conda to install snippy, and the dependency [vt] has very recently been updated (Anaconda release log) so I suspected that was the problem. My solution was to downgrade vt to an older, stable version and do a fresh install of snippy:

conda create --name snippy-env
conda install --name snippy-env -c bioconda/label/cf201901 vt==2015.11.10
conda install --name snippy-env -c bioconda snippy

Let me know if this fix is applicable to you, or if you can find a better solution!

This fix was applicable for me. Thanks!

edsonmachado avatar Jul 31 '20 00:07 edsonmachado

Rolling back vt to an older version also did it for me. conda create -n snippy_env snippy vt=0.57721

Koen-vdl avatar Sep 16 '20 12:09 Koen-vdl

Hello. Yes, we are aware of this bug now. Working towards fixing it. Thank you everyone.

andersgs avatar Sep 17 '20 18:09 andersgs

Any updates on this?

snigdhaAgarwal avatar Jul 06 '21 23:07 snigdhaAgarwal