sgkit
Scalable genetics toolkit
Hypothesis is finding cases where sgkit and scikit-allel differ. For example:

```
=================================== FAILURES ===================================
_______________________________ test_vs_skallel ________________________________

    @given(args=ld_prune_args())  # pylint: disable=no-value-for-parameter
>   @settings(max_examples=50, deadline=None, phases=PHASES_NO_SHRINK)

sgkit/tests/test_ld.py:158:
_ _ _...
```
Related to #845. Estimation of inbreeding coefficients from pedigree data does not require computation of the full kinship matrix. [Hamilton and Kerr (2017)](https://pubmed.ncbi.nlm.nih.gov/29260268/) outline an approach to this which works...
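To illustrate the general idea of getting inbreeding coefficients without building the full kinship matrix, here is a minimal sketch. It uses the standard recursive kinship definition for diploids with memoization, so only the ancestor pairs actually needed are evaluated. This is not the Hamilton and Kerr (2017) method and not sgkit's API; the `PEDIGREE` table and the function names are hypothetical, and individuals are assumed to be numbered so that parents always have smaller ids than their offspring.

```python
from functools import lru_cache

# Hypothetical pedigree: individual id -> (sire, dam); None marks an unknown parent.
# Assumption: parents always have smaller ids than their offspring.
PEDIGREE = {
    0: (None, None),
    1: (None, None),
    2: (0, 1),
    3: (0, 1),
    4: (2, 3),  # offspring of a full-sib mating
}


@lru_cache(maxsize=None)
def kinship(i, j):
    """Kinship between i and j, computed only for the pairs that are requested."""
    if i is None or j is None:
        return 0.0  # unknown parents are treated as unrelated founders
    if i == j:
        return 0.5 * (1.0 + inbreeding(i))
    if i < j:
        i, j = j, i  # always expand the younger individual (larger id)
    sire, dam = PEDIGREE[i]
    return 0.5 * (kinship(sire, j) + kinship(dam, j))


def inbreeding(i):
    """Inbreeding coefficient of i: the kinship between its two parents."""
    sire, dam = PEDIGREE[i]
    return kinship(sire, dam)


f_full_sib_offspring = inbreeding(4)
```

The cache holds only the pairs reached by the recursion, which for a single query is usually far smaller than the full n × n matrix.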
:eyes: Some source code analysis tools can help to find opportunities for improving software components. :thought_balloon: I propose to [increase the usage of augmented assignment statements](https://docs.python.org/3/reference/simple_stmts.html#augmented-assignment-statements "Augmented assignment statements") accordingly....
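As a small sketch of what such a change looks like, and of the one behavioural difference worth checking before applying it mechanically: for mutable objects, an augmented assignment updates the object in place, while the plain form rebinds the name to a new object. The function names below are hypothetical and are not taken from sgkit; plain lists are used here, and NumPy arrays behave analogously with `+=`.

```python
def extend_plain(values, extra):
    # Plain assignment: builds a new list and leaves the caller's list untouched.
    values = values + extra
    return values


def extend_augmented(values, extra):
    # Augmented assignment: mutates the caller's list in place.
    values += extra
    return values


original_a = [1, 2]
result_a = extend_plain(original_a, [3])

original_b = [1, 2]
result_b = extend_augmented(original_b, [3])
```

After running this, `original_a` is unchanged while `original_b` has been extended, so the rewrite is only safe where no other reference depends on the old value.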
I have a new Mac Mini with an Apple M1 chip. It would be good to be able to run (and develop) sgkit on this architecture.
Over at https://github.com/malariagen/vector-data/discussions/22#discussioncomment-590949, @alimanfoo notes that Ag1000G only releases their data as VCF files, and that it might be nice to have the same data in a PLINK-accessible format. Could...
From #785:

```python
import sgkit as sg
import sgkit.io.vcf as sgvcf

sgvcf.vcf_to_zarr("sgkit/tests/io/vcf/data/sample.vcf.gz", "sample.vcf.gz.zarr")
ds = sg.load_dataset("sample.vcf.gz.zarr")
sg.save_dataset(ds, "sample2.vcf.gz.zarr", mode="w")
```

prints the warning:

```
SerializationWarning: variable None has data in...
```
[Pyodide](https://github.com/pyodide/pyodide) uses WebAssembly to run Python in the browser. It has support for a lot of the PyData stack, so I wondered how easy it would be to get sgkit...
[NumPy 1.22](https://numpy.org/devdocs/release/1.22.0-notes.html) no longer supports Python 3.7, so we should consider removing support for it in the next release of sgkit. This would be consistent with https://numpy.org/neps/nep-0029-deprecation_policy.html, which says Python...
Would it be possible to use sgkit to concatenate multiple VCF files into a single Zarr store while preserving sample information? As an example, I have two files, vcf1, where sample1 is...