sgkit

Scalable genetics toolkit

Results 216 sgkit issues

Hypothesis is finding cases where sgkit and scikit-allel differ. For example:
```
=================================== FAILURES ===================================
_______________________________ test_vs_skallel ________________________________
@given(args=ld_prune_args()) # pylint: disable=no-value-for-parameter
> @settings(max_examples=50, deadline=None, phases=PHASES_NO_SHRINK)
sgkit/tests/test_ld.py:158: _ _ _...
```

bug

Related to #845. Estimation of inbreeding coefficients from pedigree data does not require computation of the full kinship matrix. [Hamilton and Kerr (2017)](https://pubmed.ncbi.nlm.nih.gov/29260268/) outline an approach to this which works...
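To illustrate the general idea of obtaining inbreeding coefficients without building a full kinship matrix, here is a minimal sketch for the diploid case. It is not the Hamilton and Kerr approach referenced above and not sgkit's implementation; it is the textbook recursive kinship definition with memoization, so only the pairs actually needed are computed. The function name `inbreeding_coefficients` and the input layout (a list of `(sire, dam)` index pairs, `None` for unknown parents, parents listed before offspring) are assumptions made for this example.

```python
from functools import lru_cache


def inbreeding_coefficients(parents):
    """Diploid inbreeding coefficients from a pedigree (illustrative sketch).

    parents: list of (sire, dam) index pairs, with None for an unknown parent.
    Assumes every parent appears before its offspring in the list.
    """

    @lru_cache(maxsize=None)
    def kinship(i, j):
        # Unknown individuals are treated as unrelated to everyone.
        if i is None or j is None:
            return 0.0
        if i == j:
            sire, dam = parents[i]
            # Self-kinship is (1 + F_i) / 2, where F_i is the parents' kinship.
            return 0.5 * (1.0 + kinship(sire, dam))
        if i < j:
            i, j = j, i  # always expand the younger individual
        sire, dam = parents[i]
        return 0.5 * (kinship(sire, j) + kinship(dam, j))

    # An individual's inbreeding coefficient is the kinship of its parents.
    return [kinship(sire, dam) for sire, dam in parents]


# Two founders (0, 1), two full sibs (2, 3), and one offspring of the sibs (4).
pedigree = [(None, None), (None, None), (0, 1), (0, 1), (2, 3)]
print(inbreeding_coefficients(pedigree))  # [0.0, 0.0, 0.0, 0.0, 0.25]
```

The recursion only visits ancestor pairs reachable from each individual's parents, which is the point of avoiding the dense matrix; a production version would need an iterative formulation to avoid deep recursion on large pedigrees.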

:eyes: Some source code analysis tools can help to find opportunities for improving software components. :thought_balloon: I propose to [increase the usage of augmented assignment statements](https://docs.python.org/3/reference/simple_stmts.html#augmented-assignment-statements "Augmented assignment statements") accordingly....
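As a rough illustration of what such a change looks like in plain Python (the variable names here are made up and do not come from the sgkit code base), the sketch below contrasts an ordinary assignment with its augmented form. It also shows one caveat worth checking before any bulk rewrite: for mutable objects such as lists, `+=` updates the object in place, so other references to it see the change.

```python
# Immutable values: both forms give the same result.
total = 0
for value in [1, 2, 3]:
    total += value  # equivalent to: total = total + value

# Mutable values: the augmented form mutates in place.
a = [1, 2]
alias_a = a
a += [3]  # extends the existing list, so alias_a changes too

b = [1, 2]
alias_b = b
b = b + [3]  # builds a new list and rebinds b; alias_b is untouched

print(total, alias_a, alias_b)  # 6 [1, 2, 3] [1, 2]
```

The same in-place behaviour applies to NumPy arrays, so a rewrite of `x = x + y` to `x += y` is only safe where no other reference relies on the old contents.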

I have a new Mac Mini with an Apple M1 chip. It would be good to be able to run (and develop) sgkit on this architecture.

process + tools

Over at https://github.com/malariagen/vector-data/discussions/22#discussioncomment-590949, @alimanfoo notes that Ag1000G only releases their data as VCF files, and that it might be nice to have the same data in a PLINK-accessible format. Could...

IO

From #785:
```python
import sgkit as sg
import sgkit.io.vcf as sgvcf

sgvcf.vcf_to_zarr("sgkit/tests/io/vcf/data/sample.vcf.gz", "sample.vcf.gz.zarr")
ds = sg.load_dataset("sample.vcf.gz.zarr")
sg.save_dataset(ds, "sample2.vcf.gz.zarr", mode="w")
```
prints the warning:
```
SerializationWarning: variable None has data in...
```

bug
upstream

[Pyodide](https://github.com/pyodide/pyodide) uses WebAssembly to run Python in the browser. It has support for a lot of the PyData stack, so I wondered how easy it would be to get sgkit...

process + tools

[NumPy 1.22](https://numpy.org/devdocs/release/1.22.0-notes.html) no longer supports Python 3.7, so we should consider removing support for it in the next release of sgkit. This would be consistent with https://numpy.org/neps/nep-0029-deprecation_policy.html, which says Python...

process + tools
upstream

Would it be possible to use sgkit to concatenate multiple VCF files into a single Zarr store while preserving sample information? As an example, I have two files: vcf1, where sample1 is...

upstream