SnpSift
SnpSift split outputs only the first chromosome
Hi, I have a 3.8 GB VCF file produced by a WGS pipeline that maps to hg38 and calls mutations with Mutect2. The file contains chromosomes chr1 ... chr22, chrX, chrY and chrM, in that order.
Running
java -jar SnpSift.jar split $PWD/sample.mnv.hg38.vcf.gz
produces a single file named
sample.mnv.hg38.chr1.vcf
which contains only the first few hundred positions in chr1. The command exits without any error.
I have not managed to reproduce the problem with a smaller VCF file, but I am happy to share the full VCF file if necessary.
Thanks in advance for any advice/help.
I have a similar problem, though my VCF is 1 TB, mapped to hg19, processed with GATK, and also includes GL contigs. The sample.1.vcf output file contains ~8,500 variants, while the original VCF has ~42,000,000 variants in total. I get the same result when using the -l option to split every N lines.
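For anyone who wants to confirm the truncation on their own data, here is a minimal sketch that counts records per contig in the input VCF and in each split output, so the numbers can be compared. It is not part of SnpSift; the function names (open_vcf, count_records_per_contig) are made up for illustration. It assumes a standard tab-separated VCF where header lines start with '#', and it uses Python's gzip module for files ending in .gz.

```python
import gzip
from collections import Counter


def open_vcf(path):
    """Open a plain-text or gzip-compressed VCF in text mode (hypothetical helper)."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def count_records_per_contig(path):
    """Return a Counter mapping each CHROM value to its number of data lines.

    Header lines (starting with '#') and blank lines are skipped.
    """
    counts = Counter()
    with open_vcf(path) as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue
            # CHROM is the first tab-separated column of a VCF data line.
            chrom = line.split("\t", 1)[0]
            counts[chrom] += 1
    return counts
```

Calling count_records_per_contig on the original file and on sample.mnv.hg38.chr1.vcf should give the same count for chr1 if the split had worked. A much smaller number in the output, or contigs present in the input with no output file at all, matches what is described above.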
I am having the exact same issue. Did you find any solution?
I have the same issue: the run wrote only 124M of data and stopped without any error or any output in the log (run on a cluster).
@kmavrommatis any idea why split does not work? It does not give any error either when trying to split a VCF file using the -l argument.
I too have had the same problem!