Jerome Kelleher
Good question - maybe it's something to do with ragged multi-character alleles (e.g. indels)?
Thanks, that seems to be working quite well now.
It's actually chugging away pretty well, so I'm going to leave it.
Good questions @tnguyengel! Any thoughts @tomwhite?
Discussion here: https://github.com/tskit-dev/tskit/discussions/2711
We need to be careful about potential zero-based vs one-based differences here. What do we currently do with plink-like pedigrees, @timothymillar?
Thanks @aktech, I'll be in touch!
This is tricky; I don't think we thought much about interoperability with Plink when doing the pedigree encoding, @timothymillar?
The point we're illustrating here is the power of open and extensible formats. Previously we had to convert VCFs to our own Zarr formats, which was time-consuming and tedious. Now...
Have you got an example where we don't do so well on compression, @elswob? I'm guessing it's some particular INFO or FORMAT field that's not being dealt with well.