vargeno
Empty .dict files
I've been trying to run vargeno on non-human data and am running into problems at the indexing stage. No error is reported during the process, but both .dict files are empty, so the genotyping step fails.
I'm working with a fragmentary reference assembly of a grasshopper genome, so neither the bioinformatic nor the biological properties of the data are what vargeno was designed for.
Do you have any tips for troubleshooting? Attached (here) is a sample of the .vcf input. Since my data is not human data and I'm obviously not working with dbSNP, it's a little unclear how to properly format this file. Variants were detected with freebayes in the first instance.
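One input sanity check that does not depend on vargeno at all is to confirm that every contig named in the .vcf also appears as a sequence name in the reference FASTA. The sketch below is only that: a minimal check, assuming the FASTA name is the first whitespace-delimited token of each header line and the VCF is tab-delimited. Whether vargeno actually requires the names to match is an assumption here, not something taken from its documentation. The function names are hypothetical and the demo data is made up.

```python
import io


def fasta_names(handle):
    """Collect sequence names: the first token after '>' on each header line."""
    names = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def vcf_contigs(handle):
    """Collect CHROM values (first tab-delimited column) from VCF data lines."""
    contigs = set()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip meta lines, the column header, and blank lines
        contigs.add(line.split("\t", 1)[0])
    return contigs


def missing_contigs(fasta_handle, vcf_handle):
    """Return the sorted VCF contig names that have no matching FASTA record."""
    return sorted(vcf_contigs(vcf_handle) - fasta_names(fasta_handle))


# Toy, made-up data purely to show the behaviour.
demo_fasta = io.StringIO(">scaffold_1 len=8\nACGTACGT\n>scaffold_2\nACGT\n")
demo_vcf = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\n"
    "scaffold_2\t3\t.\tG\tA\n"
    "scaffold_9\t1\t.\tA\tC\n"
)
print(missing_contigs(demo_fasta, demo_vcf))
```

On the real inputs, the same function could be called with `open("packardii.sub.fa")` and `open("snp.vcf")` in place of the in-memory demo handles. An empty list would rule out a naming mismatch as the cause.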
Here is the terminal output:
$ vargeno index packardii.sub.fa snp.vcf test
[BloomFilter constructBfFromGenomeseq] bit vector: 755356701/9600000000
[BloomFilter constructBfFromGenomeseq] lite bit vector: 988176227/18400000000
[BloomFilter constructBfFromVCF] bit vector: 0/1120000000
SNP Dictionary
Total k-mers: 21626752
Unambig k-mers: 20575340
Ambig unique k-mers: 296062
Ambig total k-mers: 1051412
Ref Dictionary
Total k-mers: 1305711431
Unambig k-mers: 1130124620
Ambig unique k-mers: 36489256
Ambig total k-mers: 175586811
And here are the output files:
-rw-r--r-- 1 oliver users 12348187 Feb 5 11:42 test.chrlens
-rw-r--r-- 1 oliver users 1200000008 Feb 5 10:43 test.ref.bf
-rw-r--r-- 1 oliver users 2300000008 Feb 5 10:43 test.ref.bf.lite.bf
-rw-r--r-- 1 oliver users 0 Feb 5 14:47 test.ref.dict
-rw-r--r-- 1 oliver users 140000008 Feb 5 11:41 test.snp.bf
-rw-r--r-- 1 oliver users 0 Feb 5 11:42 test.snp.dict
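For what it's worth, the three Bloom filter files appear to be complete: each file size equals the bit-vector length reported in the terminal output divided by 8, plus 8 bytes. The short sketch below reproduces that arithmetic using only the numbers shown above. Reading the extra 8 bytes as a fixed header is an inference from these sizes, not something confirmed by vargeno's documentation, and pairing `test.snp.bf` with the `constructBfFromVCF` line is likewise inferred from the names.

```python
# Bit-vector lengths from the terminal output, paired with the file sizes
# from the directory listing. The 8-byte header is an assumption inferred
# from the numbers themselves.
HEADER_BYTES = 8

reported = {
    "test.ref.bf": (9_600_000_000, 1_200_000_008),
    "test.ref.bf.lite.bf": (18_400_000_000, 2_300_000_008),
    "test.snp.bf": (1_120_000_000, 140_000_008),
}


def expected_size(n_bits, header_bytes=HEADER_BYTES):
    """Bytes needed for n_bits packed 8 per byte, plus the assumed header."""
    return n_bits // 8 + header_bytes


def size_matches(table):
    """Map each file name to True if its size equals the expected size."""
    return {
        name: expected_size(n_bits) == size
        for name, (n_bits, size) in table.items()
    }


for name, ok in size_matches(reported).items():
    print(name, "matches" if ok else "does NOT match")
```

So the only outputs that look wrong by size are the two 0-byte .dict files.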
All of the test files (in /vargeno/test) run fine and reproduce the provided output files. I'm running on Ubuntu 18.04.5 in a conda environment with the following packages:
# packages in environment at /home/oliver/miniconda2/envs/vargeno:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
bioawk                    1.0                  hed695b0_5    bioconda
libgcc-ng                 9.3.0               h2828fa1_18    conda-forge
libgomp                   9.3.0               h2828fa1_18    conda-forge
libstdcxx-ng              9.3.0               h6de172a_18    conda-forge
seqtk                     1.3                  hed695b0_2    bioconda
vargeno                   1.0.3                hc9558a2_1    bioconda
zlib                      1.2.11            h516909a_1010    conda-forge