htsjdk

Can't read or write gzipped BCFs

Open: cmnbroad opened this issue 7 years ago • 1 comment

htsjdk can neither read nor write block-gzipped (BGZF) BCFs, even though the BCF spec seems to recommend that form. The spec mostly describes BCF as a series of gzipped blocks, but it also says "Compression of a BCF file is recommended but not required", which to me introduces some ambiguity.

FWIW, bcftools "convert" has options for both compressed and uncompressed BCFs; the uncompressed option appears to produce a BCF that is not gzipped at all.
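To make the two on-disk states concrete, here is a minimal sketch of telling them apart from the leading bytes of a file. It uses only the JDK, not htsjdk, and the names `BcfSniffer`, `Kind`, `classify` and `classifyFile` are hypothetical ones made up for illustration. It assumes that a raw BCF2 file starts with the magic bytes "BCF" followed by the major version 2, and that a BGZF block is a gzip member with the FEXTRA flag set and a "BC" extra subfield. It also assumes "BC" is the first extra subfield, which is what writers typically emit but is stricter than the format requires.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of htsjdk: guesses how a BCF is stored on disk.
final class BcfSniffer {
    enum Kind { BGZF_COMPRESSED, RAW_BCF, PLAIN_GZIP, UNKNOWN }

    /** Classifies a file from its leading bytes; 14 bytes are enough to recognise BGZF. */
    static Kind classify(byte[] head) {
        // Raw (uncompressed) BCF2: magic "BCF" followed by major version 2.
        if (head.length >= 4 && head[0] == 'B' && head[1] == 'C'
                && head[2] == 'F' && head[3] == 2) {
            return Kind.RAW_BCF;
        }
        // gzip member: ID1=0x1f, ID2=0x8b, CM=8 (deflate).
        boolean gzip = head.length >= 4
                && (head[0] & 0xff) == 0x1f
                && (head[1] & 0xff) == 0x8b
                && head[2] == 8;
        if (!gzip) {
            return Kind.UNKNOWN;
        }
        // BGZF: FEXTRA flag (bit 2 of FLG) plus a "BC" extra subfield.
        // Assumption: "BC" is the first subfield, so it sits at offsets 12 and 13.
        boolean hasExtra = (head[3] & 0x04) != 0;
        if (hasExtra && head.length >= 14 && head[12] == 'B' && head[13] == 'C') {
            return Kind.BGZF_COMPRESSED;
        }
        return Kind.PLAIN_GZIP;
    }

    /** Reads the first 14 bytes of the file at the given path and classifies them. */
    static Kind classifyFile(String path) throws IOException {
        try (InputStream in = new BufferedInputStream(new FileInputStream(path))) {
            return classify(in.readNBytes(14)); // readNBytes(int) needs Java 11+
        }
    }
}
```

A BGZF result only says the outer container is block gzipped; confirming that the payload is BCF rather than, say, a bgzipped VCF would require inflating the first block and checking for the same "BCF" magic.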

cmnbroad, Jul 26 '17 19:07

FYI: in VCF version 4.4, the following line (which has been in the VCF specifications since at least 2013) will be removed from the BCF portion of the specification:

Note that currently the GATK generates raw BCF2 files (not BGZF compression at all) but this will change in the near future.

d-cameron, Oct 17 '22 06:10