sgkit issues

Results 216 sgkit issues

Mean of windowed popgen stats

Currently the windowed aggregation of statistics in `popgen.py` is hard-coded to use `np.sum` [[1](https://github.com/pystatgen/sgkit/blob/main/sgkit/stats/popgen.py#L108), [2](https://github.com/pystatgen/sgkit/blob/main/sgkit/stats/popgen.py#L275), [3](https://github.com/pystatgen/sgkit/blob/main/sgkit/stats/popgen.py#L566), [4](https://github.com/pystatgen/sgkit/blob/main/sgkit/stats/popgen.py#L1080)]. Would it be possible to make this aggregation optional or have a `span_normalise`...

timothymillar
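
To make the request above concrete, here is a minimal NumPy sketch of the difference between summing and averaging a per-variant statistic within windows. It does not reproduce sgkit's code: `window_aggregate` is a hypothetical helper, and `span_normalise` is used here only as an illustrative flag that divides each window's sum by its number of variants. Dividing by variant count is an assumption; a span-based normalisation would divide by the window length in base pairs instead.

```python
import numpy as np


def window_aggregate(values, window_starts, window_stops, span_normalise=False):
    """Aggregate per-variant values within half-open windows [start, stop).

    Hypothetical helper for illustration only, not part of sgkit.
    With span_normalise=False each window is summed (the np.sum behaviour
    described in the issue). With span_normalise=True the sum is divided by
    the number of variants in the window, i.e. a per-window mean.
    """
    values = np.asarray(values, dtype=float)
    out = np.empty(len(window_starts), dtype=float)
    for i, (start, stop) in enumerate(zip(window_starts, window_stops)):
        window = values[start:stop]
        total = window.sum()
        if span_normalise and len(window) > 0:
            out[i] = total / len(window)
        else:
            # Empty windows fall through here and yield 0.0
            out[i] = total
    return out


# Five per-variant values split into two windows: indices 0-2 and 3-4
per_variant = np.array([0.2, 0.4, 0.6, 0.1, 0.3])
starts = np.array([0, 3])
stops = np.array([3, 5])

sums = window_aggregate(per_variant, starts, stops)
means = window_aggregate(per_variant, starts, stops, span_normalise=True)
```

With summation, windows containing more variants produce larger values even when the per-variant statistic is similar, which is why an optional mean or normalised form can be easier to compare across windows.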

Provide progress bar for `vcf_to_zarr`

When converting large VCF files with `vcf_to_zarr`, it would be nice to display a progress bar to users.

hammer

Canine GWAS example notebook

@eric-czech has a nice collection of notebooks exploring real, open canine data, including phenotypes, at https://github.com/related-sciences/gwas-analysis/tree/master/notebooks/organism/canine. It might make sense to port these notebooks to `sgkit`.

hammer

documentation

Enable indexing on datasets by default

It would make it easier for new users if datasets were indexed (in the variants and samples dimensions) by default, since it makes them easier to explore, as discussed in...

tomwhite

data representation

PyData APIs notebook

Given the philosophy of `sgkit` to rely on PyData APIs when possible, I thought it might be nice to have a notebook that spends time introducing and exploring PyData projects...

hammer

documentation

Track upstream improvements to concat and rechunk issue

This issue tracks fixes and workarounds for the problem described in https://github.com/dask/dask/issues/6745, which is (temporarily) addressed in this project via #324. https://github.com/dask/dask/issues/6745 is a possible fix in Dask.

tomwhite

upstream

Follow up on guvectorize use of fastmath

I had to remove the `fastmath=True` parameter from `guvectorize` due to [this issue](https://github.com/numba/numba/issues/2919) in Numba. _Originally posted by @aktech in https://github.com/pystatgen/sgkit/issues/306#issuecomment-711432004_

hammer

upstream

Add PCA usage to user guide

After https://github.com/pystatgen/sgkit/pull/278 and https://github.com/pystatgen/sgkit/pull/262 are in, it would be helpful to add a section to the user guide on PCA. This should cover what's mentioned in https://github.com/pystatgen/sgkit/issues/95.

eric-czech

documentation

Clean up docs

While working on #504 I noticed a few things that make the docs look cleaner, in my opinion. I'm curious to hear from others. - [ ] Remove `Indices and...

hammer

documentation

Migrate Discourse posts to GitHub Discussions

A reminder to myself to do this some time soon so we can shut down Discourse.
- [ ] [Top posts](https://discourse.pystatgen.org/latest?order=views)
- [ ] Update [contributing.rst](https://github.com/pystatgen/sgkit/blob/main/docs/contributing.rst) to point to GitHub...

hammer
