Tom White
Good idea - I haven't really looked at the pedigree stuff in VCF before.
Thanks @timothymillar - that's a good suggestion.
Odd - I don't get that. I'm using Chrome on Mac.
The current implementation writes up to 3 decimal places, which I think is sufficient for most cases. I'm not planning on changing this at the moment - I think it's...
This should be fixed by Numba 0.57, which will include https://github.com/numba/numba/pull/8620
Hi @elswob, I did some work on this last year and found that on 1000 genomes chr22 (GT sparsity = 3.7%) I could store the extra fields at 46% of...
BTW there's also a summary on https://github.com/tomwhite/ga4gh-variant-comparison
> I bet we're storing `call_DS` as a 32 or 64 bit float which isn't compressing very well.

It will be a 32-bit float if it's a VCF float. @elswob...
@elswob thanks for the screenshot. The fact that `max_alt_alleles_seen` is 1 means you could set `max_alt_alleles` to 1 (for this dataset at least). Might be worth re-running to see what...
Thanks. Uncompressed, the DS field would take 406 MB, so that's a 6X compression factor with Zarr. I'm still puzzled why gzipped VCF (I assume that's what you are comparing...
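For reference, the arithmetic behind the 6X figure can be sketched roughly as below. The helper names are made up for illustration, and the array dimensions in the second call are hypothetical, not the dimensions of this dataset; only the 406 MB and 6X numbers come from the comment above.

```python
def compressed_size_mb(uncompressed_mb: float, compression_factor: float) -> float:
    """Stored size in MB, given the uncompressed size and the compression factor."""
    return uncompressed_mb / compression_factor


def float32_array_mb(n_variants: int, n_samples: int) -> float:
    """Uncompressed size in MB (10**6 bytes) of a (variants, samples) float32 array."""
    bytes_per_value = 4  # a 32-bit float occupies 4 bytes
    return n_variants * n_samples * bytes_per_value / 1e6


# 406 MB uncompressed at a 6X compression factor
print(round(compressed_size_mb(406, 6), 1))  # 67.7

# Hypothetical dimensions, only to show how the uncompressed size scales
print(float32_array_mb(1000, 100))  # 0.4
```

So a 6X factor on 406 MB corresponds to roughly 68 MB stored for the DS field.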