Support for vcf.gz files
It seems Jasmine does not support vcf.gz or bcf files:
Warning: input.vcf.gz ends with .gz, but (b)gzipped VCFs are not accepted
Exception in thread "main" java.lang.Exception: input.vcf.gz is a gzipped file, but only unzipped VCFs are accepted
Since it is quite a standard format, would it be possible for Jasmine to support both vcf.gz and bcf files?
thanks,
Hi,
Thanks for the suggestion! Unfortunately, adding support for vcf.gz and .bcf files would require fairly extensive software changes. Since the majority of SV calling software produces unzipped VCF files, there are no plans to add it in the near future.
Melanie
I understand that it might be a bit of work, but maybe you could use an existing library to read the VCF files, like htsjdk (developed by the Broad Institute).
At this point it has only partial support for VCF (VCFv4.3 can be read but not written, and there is no support for BCFv2.2), but at least you can read and write VCFv4.2 (both text and gz versions). And when they implement the rest, Jasmine will automatically support them!
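To illustrate how small the reading side could be, here is a minimal sketch that uses only the Java standard library rather than htsjdk. It checks the first two bytes for the gzip magic number instead of trusting the file extension, and wraps the stream in GZIPInputStream when they match. The class VcfOpener and its methods open and countRecords are hypothetical names made up for this example; they are not part of Jasmine or htsjdk. It does not handle BCF, and although bgzipped files are a concatenation of gzip members that GZIPInputStream is expected to read through, that should be verified on real data.

```java
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;

// Hypothetical helper for illustration only; not Jasmine's actual code.
final class VcfOpener {

    // Opens a VCF as text, transparently decompressing if it starts with
    // the gzip magic bytes (0x1f 0x8b), regardless of the file extension.
    static BufferedReader open(Path path) throws IOException {
        InputStream in = new BufferedInputStream(Files.newInputStream(path));
        in.mark(2);                 // remember the start so we can rewind
        int b1 = in.read();
        int b2 = in.read();
        in.reset();                 // rewind to the first byte
        if (b1 == 0x1f && b2 == 0x8b) {
            try {
                in = new GZIPInputStream(in);
            } catch (IOException e) {
                in.close();         // do not leak the file handle on a bad header
                throw e;
            }
        }
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    // Counts non-header, non-empty lines, i.e. variant records.
    static long countRecords(Path path) throws IOException {
        try (BufferedReader reader = open(path)) {
            return reader.lines()
                    .filter(line -> !line.isEmpty() && !line.startsWith("#"))
                    .count();
        }
    }
}
```

With something like this, existing line-by-line parsing code would only need to obtain its reader from VcfOpener.open(path) instead of opening the file directly. Writing compressed output and index support are where a library such as htsjdk would still be the better fit.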
I would like to bump this. Unzipping VCFs for large datasets is highly undesirable in terms of storage costs. Most bioinformatic tools are able to operate off of either compressed VCFs or some other lightweight binary format, which limits the reusability of the unzipped VCFs. Compression or binary support would be very much appreciated!