sgkit

Scalable genetics toolkit

Results 243 sgkit issues

The `identity_by_state` method calculates mean probabilities of identity by state (IBS) from call_allele_frequencies. The current implementation is fairly efficient when the alleles dimension is small, but not when the alleles...

enhancement
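
To make the `identity_by_state` excerpt above concrete, here is a minimal NumPy sketch of the quantity it describes: for each pair of samples, the probability that two randomly drawn alleles match, averaged over variants. This is not sgkit's implementation. The function name `mean_ibs`, the `(variants, samples, alleles)` array layout, and the absence of missing-data handling are all assumptions made for illustration.

```python
import numpy as np


def mean_ibs(call_allele_frequency):
    """Mean IBS probability for every pair of samples (hypothetical helper).

    call_allele_frequency: array of shape (variants, samples, alleles) whose
    last axis sums to 1 for each call. Missing calls are not handled here.
    """
    af = np.asarray(call_allele_frequency, dtype=float)
    n_variants = af.shape[0]
    # For samples i and j at one variant, P(IBS) = sum over alleles of f_i * f_j.
    # einsum sums that product over variants and alleles in a single pass.
    return np.einsum("via,vja->ij", af, af) / n_variants


# Two variants, two samples, two alleles.
af = np.array([
    [[1.0, 0.0], [0.5, 0.5]],  # variant 0: sample 0 hom-ref, sample 1 het
    [[1.0, 0.0], [0.0, 1.0]],  # variant 1: sample 0 hom-ref, sample 1 hom-alt
])
ibs = mean_ibs(af)
```

The contraction runs over the alleles axis for every sample pair, so its cost grows with the number of alleles, which is the dimension the issue is concerned about.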

Docs still say "sgkit: Statistical genetics toolkit in Python" https://sgkit-dev.github.io/sgkit/latest/

E.g. we can get 2 "C" values in `ds['variant_allele']`:
```python
import sgkit as sg
import numpy as np

ds = sg.simulate_genotype_call_dataset(n_variant=10, n_sample=4, missing_pct=0, phased=True, seed=1)
for i, alleles in enumerate(ds['variant_allele'].values):
    ...
```

bug

```
import sgkit as sg

ds = sg.simulate_genotype_call_dataset(n_variant=2, n_sample=4, missing_pct=0, phased=True, seed=1)
for i, alleles in enumerate(ds['variant_allele'].values):
    print(f"Site {i}: {alleles}")
```
Alleles are e.g. `[b'T' b'C']` (dtype `|S1`). I was expecting them to be...

bug

It is important to consider genome accessibility when computing rates from genomic data. scikit-allel has options to include an ["accessibility mask"](https://scikit-allel.readthedocs.io/en/stable/stats/diversity.html), a boolean array indicating whether a base is accessible...
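
As a rough illustration of why the mask matters, the sketch below computes a per-base variant rate over a window, counting only accessible positions in both the numerator and the denominator. The function `per_base_rate`, its 0-based half-open window convention, and the toy data are hypothetical; this is neither the sgkit nor the scikit-allel API.

```python
def per_base_rate(variant_positions, is_accessible, start, stop):
    """Variants per accessible base in the window [start, stop).

    variant_positions: 0-based positions of variant sites.
    is_accessible: sequence of booleans, one per base of the contig.
    """
    n_accessible = sum(is_accessible[start:stop])
    if n_accessible == 0:
        return float("nan")  # no callable bases, so the rate is undefined
    n_variants = sum(
        1 for pos in variant_positions
        if start <= pos < stop and is_accessible[pos]
    )
    return n_variants / n_accessible


# A 10-base contig where positions 2 and 3 are inaccessible.
mask = [True, True, False, False, True, True, True, True, True, True]
positions = [1, 3, 7]

masked_rate = per_base_rate(positions, mask, 0, 10)  # 2 variants / 8 bases
naive_rate = len(positions) / 10                     # ignores the mask
```

In this toy case the naive rate divides by the full window length and counts a variant at an inaccessible site, so it differs from the masked rate.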

Seems sgkit can't be installed on 3.12 due to cbgen:

Pinned packages:
- python 3.12.*
Could not solve for environment specs
The following package could not be installed
└─ sgkit...

Once vcztools has been [released](https://github.com/sgkit-dev/vcztools/issues/43) we should deprecate sgkit's `write_vcf` and `zarr_to_vcf` [functions for writing VCF](https://sgkit-dev.github.io/sgkit/latest/api.html#vcf-writing).

documentation
IO

I'm using `sgkit.save_dataset` in a notebook. Often the cell gets run several times, but after the first time, the call fails because it won't overwrite the existing file. It's unclear...
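
One possible workaround for the re-run problem described above, sketched with the standard library only: delete any existing store before saving again. The helper `save_with_overwrite` is hypothetical and takes the save call as a function, so it makes no assumption about which overwrite options `sgkit.save_dataset` itself accepts.

```python
import shutil
from pathlib import Path


def save_with_overwrite(save_fn, path):
    """Remove any existing file or directory at `path`, then call `save_fn(path)`.

    Hypothetical helper: `save_fn` is any callable that writes to the given path.
    """
    path = Path(path)
    if path.is_dir():
        shutil.rmtree(path)  # Zarr stores on local disk are directories
    elif path.exists():
        path.unlink()
    save_fn(str(path))


# In a notebook cell this might be used as (assuming `ds` is a dataset):
# save_with_overwrite(lambda p: sg.save_dataset(ds, p), "example.zarr")
```

Because the old store is deleted before the new one is written, a failure during saving leaves nothing behind, so this is only suitable for outputs that are cheap to regenerate.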

Currently if I go to https://sgkit-dev.github.io, then I get a "no pages are here" GitHub link, so it appears as if the project is defunct. I assume it is possible...

When calling `sgkit.display_genotypes(ds)`, I get a list of zeros and ones, but no clue as to what these correspond to. IMO it would be nice to have an extra column...