sgkit

Scalable genetics toolkit

Results 216 sgkit issues

See https://github.com/pystatgen/sgkit/runs/7745453482?check_suite_focus=true The problem is that cyvcf2 wheels for macOS Python 3.10 are only available from version [0.30.16](https://pypi.org/project/cyvcf2/0.30.16/#files) onwards, but that version is built against NumPy 1.23, which is incompatible with Numba....

upstream

HLA contigs have colons and dashes in their names and I believe this isn't addressed by the `vcf_to_zarr` implementation, specifically the `get_region_start` function: https://github.com/pystatgen/sgkit/blob/d08feba59415fa502150ad1d052e5377cdb83a94/sgkit/io/vcf/vcf_reader.py#L94-L100 This causes a "too many values...
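
To illustrate the ambiguity described above: a region string such as `HLA-A*01:01:01:01:5-100` cannot be split naively on `:` and `-`, because both characters also occur in the contig name. A minimal sketch of one possible workaround follows, assuming the list of contig names is known up front (for example from the VCF header). The function name `parse_region` and its signature are hypothetical and are not part of sgkit.

```python
from typing import Optional, Sequence, Tuple


def parse_region(
    region: str, contigs: Sequence[str]
) -> Tuple[str, Optional[int], Optional[int]]:
    """Split a region string into (contig, start, end).

    Hypothetical helper, not sgkit's implementation. Known contig names
    are used to disambiguate colons and dashes that belong to the name.
    """
    # A bare contig name, possibly containing ':' or '-', is returned whole.
    if region in contigs:
        return region, None, None

    # Otherwise only the LAST colon can separate the contig from the interval.
    contig, sep, interval = region.rpartition(":")
    if not sep or contig not in contigs:
        raise ValueError(f"Cannot resolve region {region!r} against known contigs")

    # The interval is 'start' or 'start-end'; it never contains a colon.
    start_str, dash, end_str = interval.partition("-")
    start = int(start_str)
    end = int(end_str) if dash and end_str else None
    return contig, start, end


contigs = ["chr1", "HLA-A*01:01:01:01"]
print(parse_region("HLA-A*01:01:01:01:5-100", contigs))
print(parse_region("HLA-A*01:01:01:01", contigs))
print(parse_region("chr1:10", contigs))
```

Checking the whole string against the known contigs first means a bare HLA contig name is never mistaken for a `contig:start` pair.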

Are these duplicate variables? `call_dosage` seems to be more explicit with ndim=2 and the associated `call_dosage_mask` but isn't currently being used. Variable `dosage` doesn't specify dimensions and is used for...

question

Edit: related to #371. I've recently started experimenting with sgkit on a [SLURM cluster](https://jobqueue.dask.org/en/latest/generated/dask_jobqueue.SLURMCluster.html) which is working well with the exception of methods using `guvectorize` with `cache=True`. Calling these functions...

process + tools

Related: https://github.com/pystatgen/sgkit/blob/84a5f6e2872a2d2882d87590d68b378f79ce8600/setup.cfg#L45-L57

process + tools

Implement the VanRaden genomic relationship matrix ([VanRaden 2008](https://www.sciencedirect.com/science/article/pii/S0022030208709901)). This is typically calculated from dosages for biallelic markers and has been generalized to autopolyploids (fixed ploidy). One consideration is the specification...
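
For orientation, the diploid biallelic case of the first VanRaden (2008) estimator can be written as G = ZZᵀ / (2 Σⱼ pⱼ(1 − pⱼ)), where Z is the dosage matrix centred by 2pⱼ. The sketch below is a plain NumPy illustration under stated assumptions: ploidy is fixed at 2, there are no missing dosages, and allele frequencies are estimated from the sample itself. The function name `vanraden_grm` is hypothetical and does not reflect sgkit's API.

```python
import numpy as np


def vanraden_grm(dosage: np.ndarray) -> np.ndarray:
    """Diploid VanRaden (method 1) relationship matrix.

    Hypothetical sketch, not sgkit's implementation.
    dosage: (samples, variants) array of alternate-allele counts in {0, 1, 2}.
    """
    dosage = np.asarray(dosage, dtype=float)
    # Assumption: allele frequencies are estimated from the same samples.
    p = dosage.mean(axis=0) / 2.0
    # Centre each variant by its expected dosage 2p.
    z = dosage - 2.0 * p
    # Scale by the summed expected variance across variants.
    denom = 2.0 * np.sum(p * (1.0 - p))
    if denom == 0:
        raise ValueError("All variants are monomorphic; the GRM is undefined")
    return (z @ z.T) / denom


x = np.array([[0, 1, 2], [1, 1, 0], [2, 0, 1]])
g = vanraden_grm(x)
print(g.shape)
```

Because the frequencies here come from the sample, every column of Z sums to zero and so does every row of the resulting matrix; supplying external reference frequencies instead would remove that property.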

The MRC IEU at Bristol has a [specification](https://github.com/MRCIEU/gwas-vcf-specification) for storing GWAS summary statistics in a VCF file. While I certainly have mixed feelings about using VCF files as a container...

Hi, I am encountering an error when calling `gwas_linear_regression` on a transposed `Dataset`. Given that all the dimensions are labelled in a standardized way, it seems to me that this...