506 comments by Tom White

Good idea - I haven't really looked at the pedigree stuff in VCF before.

Thanks @timothymillar - that's a good suggestion.

Odd - I don't get that. I'm using Chrome on Mac.

The current implementation writes up to 3 decimal places, which I think is sufficient for most cases. I'm not planning on changing this at the moment - I think it's...
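
As a rough illustration of what "up to 3 decimal places" can mean in practice, here is a minimal sketch. It is not the implementation referred to above: the function name `format_up_to_3dp` is hypothetical, and it assumes that "up to" means rounding to three places and then dropping trailing zeros.

```python
def format_up_to_3dp(value: float) -> str:
    """Format a float with at most 3 decimal places (hypothetical helper)."""
    # Round to exactly 3 decimal places first.
    text = f"{value:.3f}"
    # Then drop trailing zeros, and the decimal point if nothing follows it.
    text = text.rstrip("0").rstrip(".")
    # Avoid printing a negative zero such as "-0".
    return "0" if text == "-0" else text


print(format_up_to_3dp(0.12345))
print(format_up_to_3dp(1.5))
print(format_up_to_3dp(2.0))
```

Values that need more than three decimal places lose precision under this scheme, which is the trade-off the comment describes as sufficient for most cases.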

This should be fixed in Numba 0.57, which will include https://github.com/numba/numba/pull/8620

Hi @elswob, I did some work on this last year and found that on 1000 genomes chr22 (GT sparsity = 3.7%) I could store the extra fields at 46% of...

BTW there's also a summary on https://github.com/tomwhite/ga4gh-variant-comparison

> I bet we're storing `call_DS` as a 32 or 64 bit float which isn't compressing very well. It will be a 32-bit float if it's a VCF float. @elswob...
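
To put the 32-bit versus 64-bit distinction in concrete terms, here is a small sketch using only Python's standard `struct` module. The helper `float_field_bytes` and the value count are hypothetical and chosen purely for illustration; they are not taken from the dataset under discussion.

```python
import struct

# Standard (platform-independent) sizes of IEEE 754 floats, in bytes.
FLOAT32_BYTES = struct.calcsize("<f")
FLOAT64_BYTES = struct.calcsize("<d")


def float_field_bytes(n_values: int, bytes_per_value: int) -> int:
    """Uncompressed size of a field holding n_values floats (hypothetical helper)."""
    return n_values * bytes_per_value


n_values = 1_000_000  # hypothetical number of stored values
print(float_field_bytes(n_values, FLOAT32_BYTES))
print(float_field_bytes(n_values, FLOAT64_BYTES))
```

Before compression, a 64-bit field takes twice the space of a 32-bit one for the same number of values.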

@elswob thanks for the screenshot. The fact that `max_alt_alleles_seen` is 1 means you could set `max_alt_alleles` to 1 (for this dataset at least). Might be worth re-running to see what...

Thanks. Uncompressed, the DS field would take 406 MB, so that's a 6x compression factor with Zarr. I'm still puzzled why gzipped VCF (I assume that's what you are comparing...
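
The arithmetic behind those two figures can be sketched as follows. The function name `compressed_size_mb` is hypothetical, and the result is only the approximate compressed size implied by the 406 MB and 6x numbers quoted above, not a measured value.

```python
def compressed_size_mb(uncompressed_mb: float, compression_factor: float) -> float:
    """Compressed size implied by an uncompressed size and a compression factor."""
    return uncompressed_mb / compression_factor


# 406 MB uncompressed at a 6x compression factor (figures from the comment above).
implied_mb = compressed_size_mb(406, 6)
print(round(implied_mb, 1))  # 67.7
```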