Jerome Kelleher comments

Results 998 comments of



Better logic in remove_buffer when building ancestors, esp with missing data

Sounds good. Any site with all missing data should be ignored I think, as a guiding principle.
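As a rough illustration of that principle, here is a minimal sketch that filters out sites where every genotype is missing. It assumes a site-by-sample layout and that missing calls are coded as -1 (the value of `tskit.MISSING_DATA`); the function name `usable_site_indexes` is hypothetical and not part of any existing API.

```python
MISSING = -1  # assumption: missing genotypes are coded as -1, as in tskit.MISSING_DATA


def usable_site_indexes(genotypes):
    """Return indexes of sites that have at least one non-missing genotype.

    `genotypes` is a list of sites, each a list of per-sample allele indexes.
    """
    return [
        j
        for j, site in enumerate(genotypes)
        if any(g != MISSING for g in site)
    ]


# Hypothetical demo data: the second site is entirely missing.
genotypes = [
    [0, 1, -1, 0],
    [-1, -1, -1, -1],
    [1, -1, 1, 1],
]
kept = usable_site_indexes(genotypes)
```

Sites with only some missing calls are kept here; how those should be handled when building ancestors is a separate question.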

Make lmdb optional

I don't see why not, but there would probably be some semi-complex plumbing involved. It's only used in the SampleData, I guess? Or are we using it for storing...

Make lmdb optional

It would be nice to do, but it's not a priority. LMDB has served us well apart from a few install headaches.

Make lmdb optional

I don't think the issue is backwards compatibility, more that we're still using lmdb for the current version of AncestorData. Do I understand this correctly @benjeffery?

Make lmdb optional

I think we have enough to do for now, let's leave it as it is.

Make lmdb optional

It's not clear when or if we'll move to Zarr v3 so this isn't going to change much. Easy enough to make a store anyway.

Number of variable sites (not just num_sites)

It's not trivial, for the reasons you outline. Zero-mutation sites are easy to do though, and we use that somewhere else.
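For reference, one possible (and deliberately naive) definition of a variable site is one where more than one distinct allele is observed among the non-missing genotypes. The sketch below counts sites under that assumption only; it does not address the harder cases, the -1 missing-data code is an assumption, and `num_variable_sites` is a hypothetical name.

```python
MISSING = -1  # assumption: missing genotypes are coded as -1


def num_variable_sites(genotypes):
    """Count sites with more than one distinct non-missing allele.

    `genotypes` is a list of sites, each a list of per-sample allele indexes.
    """
    count = 0
    for site in genotypes:
        observed = {g for g in site if g != MISSING}
        if len(observed) > 1:
            count += 1
    return count


# Hypothetical demo data: only the second site is variable under this definition.
genotypes = [
    [0, 0, 0],      # monomorphic
    [0, 1, -1],     # two alleles observed
    [-1, -1, -1],   # all missing
    [1, 1, -1],     # monomorphic for allele 1
]
n_variable = num_variable_sites(genotypes)
```

Note that the last site counts as non-variable here even though every observed genotype carries allele 1, which may or may not be what a caller wants.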

Class method to create simple VariantData files for demos

We're going to need something like this for testing anyway - even better if it was just an in-memory store.

Document use of the individual.parent pedigree structure

We should also link to [sgkit's pedigree methods](https://pystatgen.github.io/sgkit/latest/examples/relatedness_tutorial.html) for a compatible way of doing more calculations. We would want a conversion function somewhere, but I'm not sure where that would...

How to save an SgkitSampleData instance, e.g. for running the CLI

I guess another possibility would be to provide an input CSV file (or Zarr) which is formatted with ``position, ancestral_allele, variant_age`` etc, which then populates the arrays appropriately.
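A minimal sketch of what reading such a CSV might look like, using only the standard library. The column names follow the ones suggested above; the function name `read_site_table`, the choice of integer positions and float ages, and the dict-of-lists return value are all assumptions for illustration, not an existing interface.

```python
import csv
import io


def read_site_table(handle):
    """Read per-site columns from a CSV file-like object into parallel lists.

    Assumes a header row containing position, ancestral_allele and variant_age.
    """
    columns = {"position": [], "ancestral_allele": [], "variant_age": []}
    reader = csv.DictReader(handle, skipinitialspace=True)
    for row in reader:
        columns["position"].append(int(row["position"]))
        columns["ancestral_allele"].append(row["ancestral_allele"])
        columns["variant_age"].append(float(row["variant_age"]))
    return columns


# Hypothetical demo input; a real file would be opened with open(path, newline="").
demo = io.StringIO(
    "position,ancestral_allele,variant_age\n"
    "10,A,120.5\n"
    "25,G,3.0\n"
)
columns = read_site_table(demo)
```

The resulting lists could then be converted to arrays and written into the relevant fields, with validation (sorted positions, matching site counts) left to the caller.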