sgkit
Scalable genetics toolkit
Hi, I realized that the `save_dataset` function doesn't return the delayed object, for instance when run with `compute=False`: https://github.com/sgkit-dev/sgkit/blob/eef911e87d1aad47cb139170bcda1b82f220114f/sgkit/io/dataset.py#L71 Running `save_dataset` with `compute=False` has use cases, for example visualizing...
Hi, I am working with array datasets and wish to concatenate samples across multiple zarr stores (>100). Since these are genotyping array datasets, they only differ in sample dimension. Everything...
When working on a reasonably large dataset (7TiB Zarr store), I noticed that diversity calculations, in particular windowed ones, generate lots of unmanaged memory. The `call_genotype` data portion is 280GiB...
Part of #908. This is still a draft, as there are a couple of things that might need more attention before merging.
scikit-allel has functionality to [calculate Watterson's theta](https://scikit-allel.readthedocs.io/en/stable/stats/diversity.html#allel.watterson_theta), but in sgkit it is calculated implicitly in the [Tajima's D function](https://github.com/sgkit-dev/sgkit/blob/54c8abe49a91fab43cc905d1ed5190397c5af8b7/sgkit/stats/popgen.py#L476). For some applications direct access to Watterson's theta would be of practical...
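For readers unfamiliar with the statistic, here is a minimal sketch of the textbook Watterson estimator, theta_W = S / a_n with a_n = sum of 1/i for i = 1..n-1. The function name and its parameters are hypothetical and are not part of sgkit or scikit-allel; the value is not normalized per base and no windowing is attempted.

```python
def watterson_theta_sketch(num_segregating_sites, num_chromosomes):
    """Textbook Watterson's theta (hypothetical helper, not sgkit's API).

    num_segregating_sites: count S of variable sites in the sample.
    num_chromosomes: number n of sampled chromosomes (haplotypes).
    """
    if num_chromosomes < 2:
        raise ValueError("need at least two sampled chromosomes")
    # a_n is the (n-1)-th harmonic number: 1 + 1/2 + ... + 1/(n-1)
    a_n = sum(1.0 / i for i in range(1, num_chromosomes))
    return num_segregating_sites / a_n


# Example: 11 segregating sites among 4 chromosomes.
# a_n = 1 + 1/2 + 1/3 = 11/6, so theta is 6 (up to floating-point rounding).
theta = watterson_theta_sketch(11, 4)
```

Dividing the result by the number of accessible bases would give a per-base value, if that is what the application needs.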
For tools using the CLI this amount of delay feels excessive. Around 1s of this time is performing imports. Here's the import flame graph: (0.1s on `xarray.tutorial`?!?)  I assume...
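One rough way to check how much of a startup delay comes from imports is to time a single import in a fresh interpreter. The sketch below uses only the standard library; `time_import` is a hypothetical helper, and the measurement only reflects a cold import if the module has not already been loaded in the process.

```python
import importlib
import sys
import time


def time_import(module_name):
    """Return (seconds_elapsed, was_already_loaded) for importing a module.

    Hypothetical helper for illustration. If the module was already in
    sys.modules, the elapsed time reflects a cache hit, not a cold import.
    """
    was_already_loaded = module_name in sys.modules
    start = time.perf_counter()
    importlib.import_module(module_name)
    elapsed = time.perf_counter() - start
    return elapsed, was_already_loaded


# Demonstrated on a standard-library module; substitute the package of interest.
elapsed, cached = time_import("json")
print(f"json: {elapsed:.4f}s (already loaded: {cached})")
```

For a per-module breakdown, CPython's `python -X importtime -c "import <package>"` prints cumulative and self import times to stderr.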
Currently it is [`Development Status :: 3 - Alpha`](https://github.com/pystatgen/sgkit/blob/main/setup.cfg#L13), should we change it to [Beta or Production/Stable](https://pypi.org/classifiers/), or perhaps remove entirely?
While trying to install sgkit in Guix I had an error with the three submodules not being found. This can be fixed by specifying the submodules in the pyproject.toml here...
Hi, the link `https://storage.googleapis.com/sgkit-data/tutorial/1kg.vcf.bgz` referenced in [gwas_tutorial](https://sgkit-dev.github.io/sgkit/latest/examples/gwas_tutorial.html) is no longer accessible. Thanks
Hi! Thanks for creating sgkit. Very useful package. We are using it for performing calculations on a dataset that was converted from bgen. For reading, our code uses something like...