sgkit
Scalable genetics toolkit
Currently the windowed aggregation of statistics in `popgen.py` is hard-coded to use `np.sum` [[1](https://github.com/pystatgen/sgkit/blob/main/sgkit/stats/popgen.py#L108), [2](https://github.com/pystatgen/sgkit/blob/main/sgkit/stats/popgen.py#L275), [3](https://github.com/pystatgen/sgkit/blob/main/sgkit/stats/popgen.py#L566), [4](https://github.com/pystatgen/sgkit/blob/main/sgkit/stats/popgen.py#L1080)]. Would it be possible to make this aggregation optional or have a `span_normalise`...
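To make the request concrete, here is a minimal NumPy sketch of the difference between a hard-coded windowed sum and an optionally normalised one. It is not sgkit's implementation: `window_aggregate` and its `normalise_by_count` flag are hypothetical names, and dividing by the number of variants in each window is only one possible reading of normalisation (the issue text is truncated, so what `span_normalise` would divide by is not stated here).

```python
import numpy as np


def window_aggregate(values, window_starts, window_stops, normalise_by_count=False):
    """Aggregate per-variant values over half-open windows [start, stop).

    Hypothetical helper for illustration only; not part of sgkit.
    """
    values = np.asarray(values, dtype=float)
    out = np.empty(len(window_starts), dtype=float)
    for i, (start, stop) in enumerate(zip(window_starts, window_stops)):
        total = values[start:stop].sum()  # the behaviour that is currently fixed
        if normalise_by_count:
            count = stop - start
            # Assumption: normalise by the number of variants in the window.
            out[i] = total / count if count > 0 else np.nan
        else:
            out[i] = total
    return out


per_variant = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
starts = np.array([0, 2, 4])
stops = np.array([2, 4, 6])

sums = window_aggregate(per_variant, starts, stops)
means = window_aggregate(per_variant, starts, stops, normalise_by_count=True)
print(sums)   # windowed sums
print(means)  # windowed sums divided by variant count
```

Keeping the sum as the default would preserve existing behaviour, with normalisation as an opt-in.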
When converting large VCF files with `vcf_to_zarr`, it would be nice to display a progress bar to users.
@eric-czech has a nice collection of notebooks exploring real, open canine data, including phenotypes, at https://github.com/related-sciences/gwas-analysis/tree/master/notebooks/organism/canine. It might make sense to port these notebooks to `sgkit`.
It would make it easier for new users if datasets were indexed (in the variants and samples dimensions) by default, since it makes them easier to explore, as discussed in...
Given the philosophy of `sgkit` to rely on PyData APIs when possible, I thought it might be nice to have a notebook that spends time introducing and exploring PyData projects...
This is an issue to track fixes/workarounds to the problem described in https://github.com/dask/dask/issues/6745, and (temporarily) addressed in this project via #324. https://github.com/dask/dask/issues/6745 is a possible fix in Dask
I had to get rid of `fastmath=True` param from `guvectorize` due to [this issue](https://github.com/numba/numba/issues/2919) in Numba. _Originally posted by @aktech in https://github.com/pystatgen/sgkit/issues/306#issuecomment-711432004_
After https://github.com/pystatgen/sgkit/pull/278 and https://github.com/pystatgen/sgkit/pull/262 are in, it would be helpful to add a section to the user guide on PCA. This should cover what's mentioned in https://github.com/pystatgen/sgkit/issues/95.
While working on #504 I noticed a few things that make the docs look cleaner, in my opinion. I'm curious to hear from others.
- [ ] Remove `Indices and...
A reminder to myself to do this some time soon so we can shut down Discourse.
- [ ] [Top posts](https://discourse.pystatgen.org/latest?order=views)
- [ ] Update [contributing.rst](https://github.com/pystatgen/sgkit/blob/main/docs/contributing.rst) to point to GitHub...