bcftools

bcftools view -S data size vcf

Open hitwbt opened this issue 1 year ago • 1 comment

Hi, may I ask why, when I use the command bcftools view -S 1.txt FAM596.vcf.gz -Oz > NA19919.vcf.gz to extract sample NA19919, the output single-sample VCF (1.16G) is bigger than the original three-sample VCF (1.13G)? Shouldn't it be about one-third the size of FAM596.vcf.gz?

hitwbt • Apr 25 '24 08:04

Possibly. It depends on how big the mandatory columns (CHROM through INFO) are compared to the FORMAT fields. Why don't you look at the output file and compare it with the input file? It also matters whether you are comparing uncompressed or compressed files: compression can shrink the size differences when the data is easily compressible, i.e. has low information entropy.

pd3 • Apr 27 '24 18:04
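
One way to act on the suggestion to compare the two files is sketched below. It is not from the thread: the function name vcf_size_report is hypothetical, and it relies only on standard tools (gzip, grep, cut, wc), assuming that the bgzip-compressed output of -Oz can be read with gzip -dc because BGZF is gzip-compatible. For each file it prints the compressed size, the uncompressed size, and how many uncompressed bytes of the data lines sit in columns 1-8 (CHROM through INFO) versus columns 9 onward (FORMAT plus the per-sample columns).

```shell
# Hypothetical helper (not part of bcftools): report where the bytes of a
# gzip/bgzip-compressed VCF go.
vcf_size_report() {
  f="$1"
  # Size on disk, i.e. the number being compared in the question.
  compressed=$(wc -c < "$f" | tr -d ' ')
  # Size after decompression, header lines included.
  uncompressed=$(gzip -dc "$f" | wc -c | tr -d ' ')
  # Data lines only: columns 1-8 are CHROM..INFO.
  fixed=$(gzip -dc "$f" | grep -v '^#' | cut -f1-8 | wc -c | tr -d ' ')
  # Data lines only: column 9 is FORMAT, columns 10+ are the samples.
  samples=$(gzip -dc "$f" | grep -v '^#' | cut -f9- | wc -c | tr -d ' ')
  printf '%s\tcompressed=%s\tuncompressed=%s\tfixed_cols=%s\tsample_cols=%s\n' \
    "$f" "$compressed" "$uncompressed" "$fixed" "$samples"
}

# File names taken from the question; files that are not present are skipped.
for f in FAM596.vcf.gz NA19919.vcf.gz; do
  if [ -f "$f" ]; then
    vcf_size_report "$f"
  fi
done
```

If fixed_cols dominates in both files, dropping two of three samples cannot reduce the size to one-third, and small differences in the compressed size then say more about the compression than about the amount of data removed.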