snippy-multi - bed targets file is not open

Open Silverfoxcome opened this issue 1 year ago • 5 comments

Hi! Thanks for this useful program!

This is the first time I have used snippy-multi. This is my input.tab file:

PNG84A	/media/koala/Main/mishell/snippy-multi/t1/210.2442.fna
G272	/media/koala/Main/mishell/snippy-multi/t1/210.2712.fna

My targets are in virulent_protein_targets.bed (a tab-separated file):

CP003904    806853  808877  fliD
CP003904    445347  448223  23Srna  
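A quick way to rule out formatting problems in a targets file like the one above is to check it mechanically. The sketch below is only an illustration: `check_bed_lines` is a made-up helper name, and the checks (tab separation, integer coordinates, start before end, ascending order, trailing whitespace) are assumptions about what might matter, not a statement of what snippy or freebayes actually require.

```python
def check_bed_lines(lines):
    """Return a list of human-readable problems found in BED-style lines.

    Hypothetical helper: these checks are assumptions, not the rules
    enforced by snippy or freebayes.
    """
    problems = []
    previous = None  # (chrom, start) of the previous parsed record
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        if line != line.rstrip():
            problems.append(f"line {number}: trailing whitespace")
        fields = line.rstrip().split("\t")
        if len(fields) < 3:
            problems.append(f"line {number}: fewer than 3 tab-separated fields")
            continue
        chrom, start_text, end_text = fields[0], fields[1], fields[2]
        if not (start_text.isdigit() and end_text.isdigit()):
            problems.append(f"line {number}: start/end are not integers")
            continue
        start, end = int(start_text), int(end_text)
        if start >= end:
            problems.append(f"line {number}: start is not less than end")
        if previous is not None and previous[0] == chrom and start < previous[1]:
            problems.append(f"line {number}: not sorted by start position")
        previous = (chrom, start)
    return problems


# The two records from this post, written with real tab characters.
example = [
    "CP003904\t806853\t808877\tfliD\n",
    "CP003904\t445347\t448223\t23Srna  \n",
]
for problem in check_bed_lines(example):
    print(problem)
```

To check a real file, pass an open file handle instead of the list, for example `check_bed_lines(open("virulent_protein_targets.bed"))`.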

The command that I ran was:

snippy-multi input.tab --ref ncbi_hp_26695.gb --cpus 2 --targets virulent_protein_targets.bed > runme2.sh

It printed:

Reading: input.tab
Generating output commands for 2 isolates
Done.

And finally: bash runme2.sh

While it was running I noticed this message:

[20:24:12] Running: freebayes-parallel reference/ref.txt 2 -p 2 -P 0 -C 2 -F 0.05 --min-coverage 10 --min-repeat-entropy 1.0 -q 13 -m 60 --strict-vcf --targets '/media/koala/Main/mishell/snippy-multi/t1/PNG84A/virulent_protein_targets.bed' -f reference/ref.fa snps.bam > snps.raw.vcf 2>> snps.log
[20:24:12] Error running command, check PNG84A/snps.log

It kept running, but at the end it showed this message:

[samclip] Total SAM records 128955, removed 17882, allowed 7721, passed 111073
[samclip] Header contained 3 lines
[samclip] Done.
[20:24:41] Running: samtools index snps.bam 2>> snps.log
[20:24:41] Running: fasta_generate_regions.py reference/ref.fa.fai 565248 > reference/ref.txt 2>> snps.log
[20:24:41] Running: freebayes-parallel reference/ref.txt 2 -p 2 -P 0 -C 2 -F 0.05 --min-coverage 10 --min-repeat-entropy 1.0 -q 13 -m 60 --strict-vcf --targets '/media/koala/Main/mishell/snippy-multi/t1/G272/virulent_protein_targets.bed' -f reference/ref.fa snps.bam > snps.raw.vcf 2>> snps.log
[20:24:41] Error running command, check G272/snps.log

I wanted to go and check that G272/snps.log but I waited until the program ended.

The program kept running:

This is snippy-core 4.6.0
Obtained from http://github.com/tseemann/snippy
Enabling bundled tools for linux
Found any2fasta - /home/koala/biotools/snippy/binaries/noarch/any2fasta
Found samtools - /home/koala/anaconda3/bin/samtools
Found minimap2 - /home/koala/anaconda3/bin/minimap2
Found bedtools - /home/koala/anaconda3/bin/bedtools
Found snp-sites - /home/koala/biotools/bin/snp-sites
Saving reference FASTA: core.ref.fa
This is any2fasta 0.4.2
Opening 'G272/ref.fa'
Detected FASTA format
Read 27800 lines from 'G272/ref.fa'
Wrote 1 sequences from FASTA file.
Processed 1 files.
Done.
Loaded 1 sequences totalling 1667892 bp.
Will mask 0 regions totalling 0 bp ~ 0.00%
ERROR: Could not find .aligned.fa/.vcf in G272

And it stops there. The previous message before the end said [20:24:41] Error running command, check G272/snps.log

I went to check that file. There was a warning:

### snpEff build -c reference/snpeff.config -dataDir . -gff3 ref

WARNING: All frames are zero! This seems rather odd, please check that 'frame' information in your 'genes' file is accurate.

But it kept running until this point:

### freebayes-parallel reference/ref.txt 2 -p 2 -P 0 -C 2 -F 0.05 --min-coverage 10 --min-repeat-entropy 1.0 -q 13 -m 60 --strict-vcf   --targets '/media/koala/Main/mishell/snippy-multi/t1/G272/virulent_protein_targets.bed' -f reference/ref.fa snps.bam > snps.raw.vcf

bed targets file is not open
bed targets file is not open
bed targets file is not open

What does that mean? I'm uploading my bed file here (renamed to .txt because GitHub doesn't support .bed attachments): virulent_protein_targets.txt

I will be very grateful if you could help me with this issue.

Thanks a lot in advance :)

Silverfoxcome avatar Jul 12 '22 02:07 Silverfoxcome

In the issue #312 the user has the same problem as me. I will try with the absolute path to my targets :)
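For what it's worth, the failing freebayes command above was given a targets path ending in `t1/PNG84A/virulent_protein_targets.bed`, which suggests the relative file name was resolved inside the sample's output folder. The small sketch below illustrates that effect under the assumption that the BED file actually sits in the run directory `t1`; `resolve` is a hypothetical helper, not part of snippy.

```python
import posixpath


def resolve(relative_name, working_dir):
    """Show where a path ends up when interpreted from working_dir."""
    return posixpath.normpath(posixpath.join(working_dir, relative_name))


# Assumption: the BED file lives in the directory where snippy-multi was run.
run_dir = "/media/koala/Main/mishell/snippy-multi/t1"
sample_dir = posixpath.join(run_dir, "PNG84A")
name = "virulent_protein_targets.bed"

print(resolve(name, run_dir))     # where the file is assumed to be
print(resolve(name, sample_dir))  # the path that appears in the failing command

# An absolute path resolves to the same file from any directory.
absolute = resolve(name, run_dir)
print(resolve(absolute, sample_dir))
```

On a real system, `os.path.abspath("virulent_protein_targets.bed")` run from the directory containing the file gives the absolute path to pass to `--targets`.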

Silverfoxcome avatar Jul 12 '22 02:07 Silverfoxcome

I got an error again, but this time the program ran a lot further. I'm attaching a copy of my log file here for more details: snippy.log

While running, the program gave me a lot of lines of this same message:

[bcf_ordered_writer.cpp:163 write] Might not be sorted for window size 10000 at current record CP003904:806930 < 1088758 (1098758 [last record] - 10000), please increase window size to at least 291829.

Can anyone tell me what it means please?
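Reading only the numbers in the message itself, it appears to say that the current record (position 806930) is further behind the last record written (1098758) than the 10000 bp window allows, so the writer cannot guarantee sorted output. The sketch below just reproduces that arithmetic; `explain_warning` is a hypothetical helper, and the "+ 1" in the suggested window is inferred from the numbers in the log, not from the tool's source code.

```python
import re

# Matches the numeric parts of the "Might not be sorted" warning text.
WARNING = re.compile(
    r"current record (?P<chrom>\S+):(?P<pos>\d+) < (?P<threshold>\d+) "
    r"\((?P<last>\d+) \[last record\] - (?P<window>\d+)\)"
)


def explain_warning(line):
    """Return the numbers behind one warning line, or None if it does not match."""
    match = WARNING.search(line)
    if match is None:
        return None
    pos = int(match["pos"])
    last = int(match["last"])
    window = int(match["window"])
    return {
        "position": pos,
        "last_written": last,
        "threshold": last - window,       # records below this trigger the warning
        "needed_window": last - pos + 1,  # inferred from the logged suggestion
    }


line = (
    "[bcf_ordered_writer.cpp:163 write] Might not be sorted for window size 10000 "
    "at current record CP003904:806930 < 1088758 (1098758 [last record] - 10000), "
    "please increase window size to at least 291829."
)
print(explain_warning(line))
```

In other words, the records seem to arrive out of positional order, which would fit a targets file whose regions are not listed in ascending order.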

Then, like in the previous run, the program told me to check G272/snps.log:

[21:24:48] Running: snpEff ann -noLog -noStats -no-downstream -no-upstream -no-utr -c reference/snpeff.config -dataDir . ref snps.filt.vcf > snps.vcf 2>> snps.log
[21:24:56] Running: /home/koala/biotools/snippy/bin/snippy-vcf_to_tab --gff reference/ref.gff --ref reference/ref.fa --vcf snps.vcf > snps.tab 2>> snps.log
[21:24:59] Running: /home/koala/biotools/snippy/bin/snippy-vcf_extract_subs snps.filt.vcf > snps.subs.vcf 2>> snps.log
[21:25:00] Running: bcftools convert -Oz -o snps.vcf.gz snps.vcf 2>> snps.log
[21:25:01] Running: bcftools index -f snps.vcf.gz 2>> snps.log
[21:25:01] Error running command, check G272/snps.log

Then it kept going:

This is snippy-core 4.6.0
Obtained from http://github.com/tseemann/snippy
Enabling bundled tools for linux
Found any2fasta - /home/koala/biotools/snippy/binaries/noarch/any2fasta
Found samtools - /home/koala/anaconda3/bin/samtools
Found minimap2 - /home/koala/anaconda3/bin/minimap2
Found bedtools - /home/koala/anaconda3/bin/bedtools
Found snp-sites - /home/koala/biotools/bin/snp-sites
Saving reference FASTA: core.ref.fa
This is any2fasta 0.4.2
Opening 'G272/ref.fa'
Detected FASTA format
Read 27800 lines from 'G272/ref.fa'
Wrote 1 sequences from FASTA file.
Processed 1 files.
Done.
Loaded 1 sequences totalling 1667892 bp.
Will mask 0 regions totalling 0 bp ~ 0.00%
ERROR: Could not find .aligned.fa/.vcf in G272

And ends with that error.

I checked the G272/snps.log and it says: image

It seems something goes wrong when bcftools wants to index my vcf file.

Perhaps there is a problem with my bed file. Maybe because the positions are not in order? image

I'll correct that mistake and run the program again :) Thanks a lot in advance!

Silverfoxcome avatar Jul 12 '22 02:07 Silverfoxcome

Hi!!!

I got an error again. I'm attaching a copy of my log file here for more details: snippy3.log

Like in the previous run, the program printed a message like this several times:

[bcf_ordered_writer.cpp:163 write] Might not be sorted for window size 10000 at current record CP003904:806930 < 1088758 (1098758 [last record] - 10000), please increase window size to at least 291829.

Then it seems there was an error here:

[21:46:42] Running: bcftools index -f snps.vcf.gz 2>> snps.log
[21:46:42] Error running command, check PNG84A/snps.log

Same with the other isolate:

[21:47:48] Running: bcftools index -f snps.vcf.gz 2>> snps.log
[21:47:48] Error running command, check G272/snps.log

It seems something goes wrong when bcftools wants to sort and index my vcf file.

It ends with the message:

Loaded 1 sequences totalling 1667892 bp.
Will mask 0 regions totalling 0 bp ~ 0.00%
ERROR: Could not find .aligned.fa/.vcf in G272

I went to check the G272/snps.log

image

It seems to be linked to when bcftools wants to sort and index my vcf file and something in my bed file makes it fail.

It seems like a problem in my bed file, but I don't understand what is wrong with it. image

I'm attaching it here (in .txt): virulent_protein_targets_sorted.txt

I'll be very grateful for any help understanding what I can do to make this work! Thanks a lot in advance!

Silverfoxcome avatar Jul 12 '22 03:07 Silverfoxcome

Hi!!

I ran the command without my targets in bed format (virulent_protein_targets_sorted.bed):

snippy-multi file4.txt --ref ncbi_hp_26695.gb --cpus 2 > runme.sh

bash runme.sh

And it seems that everything worked well. I'm attaching a copy of my log file here for more details: snippy4.log

So it seems the problem was my virulent_protein_targets_sorted.bed file.

It seems to be linked to when bcftools wants to sort and index my vcf file and something in my bed file makes it fail.

I think these warnings might also have something to do with it:

[bcf_ordered_writer.cpp:163 write] Might not be sorted for window size 10000 at current record CP003904:806930 < 1088758 (1098758 [last record] - 10000), please increase window size to at least 291829.
[bcf_ordered_writer.cpp:163 write] Might not be sorted for window size 10000 at current record CP003904:806942 < 1088758 (1098758 [last record] - 10000), please increase window size to at least 291817.
...
...
[bcf_ordered_writer.cpp:163 write] Might not be sorted for window size 10000 at current record CP003904:808822 < 1088758 (1098758 [last record] - 10000), please increase window size to at least 289937.
[bcf_ordered_writer.cpp:163 write] Might not be sorted for window size 10000 at current record CP003904:808840 < 1088758 (1098758 [last record] - 10000), please increase window size to at least 289919.

I'm attaching it here, in .txt format.

Please, I'll be very grateful if someone could help me understand what is wrong with my bed file. virulent_protein_targets_sorted.txt

Thanks a lot in advance for any advice!!!

Silverfoxcome avatar Jul 12 '22 03:07 Silverfoxcome

Dear Silverfoxcome, it's a little too late, but I think that the bed file has to look like this:

CP003904:806853-808877
CP003904:445347-448223

Best wishes

mlarjim avatar Feb 03 '23 08:02 mlarjim
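For anyone who wants to try the region-style format suggested in the last comment, the conversion from the original BED lines is mechanical. The sketch below is only an illustration: `bed_to_regions` is a made-up helper, it copies the coordinates unchanged as in the comment above, and whether snippy's `--targets` option actually accepts this format has not been confirmed in this thread.

```python
def bed_to_regions(lines):
    """Turn BED-style lines (chrom, start, end, ...) into chrom:start-end strings.

    Hypothetical helper: coordinates are copied as-is, with no adjustment
    for any difference in coordinate conventions.
    """
    regions = []
    for raw in lines:
        if raw.startswith("#"):
            continue  # skip comment lines
        fields = raw.split()  # accepts tabs or runs of spaces
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        chrom, start, end = fields[:3]
        regions.append(f"{chrom}:{start}-{end}")
    return regions


bed_lines = [
    "CP003904    806853  808877  fliD",
    "CP003904    445347  448223  23Srna  ",
]
for region in bed_to_regions(bed_lines):
    print(region)
```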