Results: 132 issues by Tom White

If the dataset has `sample_family_id`, `sample_paternal_id`, `sample_maternal_id` fields (e.g. from `read_plink`), then we can use those to write family information in `write_plink`. (See https://www.cog-genomics.org/plink/1.9/formats#fam) Otherwise we should set FID to...
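As a rough illustration of the idea, the sketch below formats the six whitespace-separated `.fam` columns (FID, IID, father, mother, sex, phenotype) from plain Python lists. The function name `fam_lines` and the `fallback_fid` parameter are hypothetical and not part of sgkit. The fallback FID is left to the caller because the text above is truncated at that point. Sex is written as `0` (unknown) and phenotype as `-9` (missing), which is an assumption made to keep the example short.

```python
def fam_lines(sample_ids, fallback_fid, family_ids=None,
              paternal_ids=None, maternal_ids=None):
    """Return one PLINK .fam line per sample (hypothetical helper, not sgkit API)."""
    n = len(sample_ids)
    # Use the family fields when present; otherwise fall back to caller-chosen values.
    if family_ids is None:
        family_ids = [fallback_fid] * n
    # In the .fam format, "0" marks a parent that is not in the dataset.
    if paternal_ids is None:
        paternal_ids = ["0"] * n
    if maternal_ids is None:
        maternal_ids = ["0"] * n

    lines = []
    for fid, iid, pat, mat in zip(family_ids, sample_ids, paternal_ids, maternal_ids):
        # Sex 0 = unknown, phenotype -9 = missing (assumed here for brevity).
        lines.append(f"{fid} {iid} {pat} {mat} 0 -9")
    return lines
```

Calling `fam_lines(["s1", "s2"], fallback_fid="F")` would write the fallback value in the FID column of every row, while passing `family_ids`, `paternal_ids`, and `maternal_ids` writes the stored family information instead.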

IO

We can use the family information from plink (https://www.cog-genomics.org/plink/1.9/formats#fam) to populate the sgkit `parent_id` variable (https://pystatgen.github.io/sgkit/latest/generated/sgkit.variables.parent_id_spec.html).
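A minimal sketch of the mapping, using plain lists rather than the sgkit or xarray APIs: each sample gets a `[father, mother]` pair, and the `.fam` placeholder `"0"` is replaced by a missing marker. The function name `parent_ids_from_fam` is hypothetical, and the default marker `"."` is an assumption about how unknown parents are represented in `parent_id`.

```python
def parent_ids_from_fam(paternal_ids, maternal_ids, fam_missing="0", missing="."):
    """Build per-sample [father, mother] ID pairs (hypothetical helper).

    `fam_missing` is the .fam code for "parent not in dataset";
    `missing` is the marker written in its place (assumed to be ".").
    """
    def convert(parent):
        return missing if parent == fam_missing else parent

    return [[convert(p), convert(m)] for p, m in zip(paternal_ids, maternal_ids)]
```

The resulting nested list has one row per sample and two columns, which matches the shape a `(samples, parents)` variable would need.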

IO

See https://github.com/pystatgen/sgkit/actions/runs/3727751300/jobs/6328533572#step:5:6033, which looks similar to https://github.com/numba/numba/issues/8615

upstream

Check that the sgkit implementation from #975 gets the same results on real data as the reference implementation in https://github.com/ramachandran-lab/genee. There is an initial implementation in https://github.com/tomwhite/sgkit/tree/genee-2022-validation, but it...

This is a first step towards implementing #224. I added a code path separate from the Dask one, in `cubed_groupby_agg`, since it is sufficiently different (for example, the combine step...

See https://github.com/sgkit-dev/sgkit/milestone/6

process + tools

This is https://github.com/bigdatagenomics/eggo/issues/30 rebased on https://github.com/bigdatagenomics/eggo/pull/29

The time between calls to `internal_storage.get_call_status` is hardcoded to 1 second (`GET_RESULT_SLEEP_SECS`); see https://github.com/lithops-cloud/lithops/blob/b199505955f11602331e4725bf1844f885e2f7f8/lithops/future.py#L201-L203. It would be useful to be able to control this so that it is the same as...
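To illustrate the request, here is a generic polling loop in which the sleep interval is a parameter rather than a constant. It is only a sketch and not Lithops code: `wait_for_status`, `get_status`, and `poll_interval_secs` are hypothetical names, and the 1-second default simply mirrors the hardcoded value described above.

```python
import time


def wait_for_status(get_status, poll_interval_secs=1.0, timeout_secs=None,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll `get_status()` until it returns a non-None value (hypothetical helper).

    `sleep` and `clock` are injectable so the loop can be tested without waiting.
    """
    start = clock()
    while True:
        status = get_status()
        if status is not None:
            return status
        if timeout_secs is not None and clock() - start >= timeout_secs:
            raise TimeoutError("status not available before timeout")
        # The interval is configurable instead of a fixed constant.
        sleep(poll_interval_secs)
```

A caller that already polls storage at some other interval could pass the same value as `poll_interval_secs` so that both loops run at the same rate.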

Part of #1324. Developer's Certificate of Origin 1.1: By making a contribution to this project, I certify that: (a) The contribution was created in whole or in part by me...

Lithops has a lot of dependencies in the base package, many of which are not needed when running Lithops. There are a few things we could do here: 1. Move...