sgkit

Scalable genetics toolkit

Results: 216 sgkit issues

Suggested by @benjeffery here: https://github.com/pystatgen/sgkit/pull/1054#pullrequestreview-1360645992 Do the same for `filters` and `contig_lengths`.

IO

#1043 shows that we should test with a process-based dask cluster. I've tried this by adding `client = dask.distributed.Client(n_workers=1, threads_per_worker=1)` to `conftest.py` but I get segfaults in workers. Attaching GDB...

As described in #1190, we are currently returning int FILL values (-2) rather than missing data (-1) for INFO fields, and (I think) FORMAT fields as well. I'm not sure...

bug
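
To make the fill versus missing distinction in the issue above concrete, here is a minimal pure-Python sketch. It assumes only the two sentinel values quoted in the issue text (-1 for missing, -2 for fill). The constant names and the `count_sentinels` helper are hypothetical illustrations, not part of the sgkit API.

```python
# Sentinel values as quoted in the issue text; the names are illustrative only.
INT_MISSING = -1  # the value was explicitly recorded as missing
INT_FILL = -2     # padding for slots that hold no value at all


def count_sentinels(values):
    """Return (n_missing, n_fill) for a flat list of integers."""
    n_missing = sum(1 for v in values if v == INT_MISSING)
    n_fill = sum(1 for v in values if v == INT_FILL)
    return n_missing, n_fill


# Hypothetical integer field values for five variants.
info_field = [7, INT_FILL, INT_MISSING, 3, INT_FILL]
print(count_sentinels(info_field))
```

A count like this could help spot a field where fill values appear in places that would be expected to hold missing values.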

We [decided](https://github.com/pystatgen/sgkit/discussions/1166) to rename the organization to `sgkit-dev` and to expand `sgkit` to "Scalable genetics toolkit". We should make the corresponding changes and make sure all project resources work.

documentation
process + tools

Does xarray mandate that Zarr chunk sizes must be the same for all variables with a given dimension? This is quite a hard restriction if so, as we can have...

@jeromekelleher to bring over https://github.com/pystatgen/sgkit-publication/pull/94, motivated by https://github.com/pystatgen/sgkit/issues/1168. Related to https://github.com/pystatgen/sgkit/issues/1130.

IO

I'm assuming that variant positions must be 1 or greater in an sgkit dataset. Is this part of the spec, or enforced anywhere? I looked through the docs and couldn't...

Type: Bug
Affects: Alpha
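
The question above could be probed with a small check like the following sketch. It assumes, as the issue author does, that valid variant positions are 1 or greater. Whether sgkit specifies or enforces this is exactly what the issue asks, so the helper `invalid_position_indexes` is a hypothetical illustration rather than an existing sgkit function.

```python
def invalid_position_indexes(positions):
    """Return the indexes of positions that are below 1.

    Assumes 1-based coordinates, which the issue above treats as an
    open question rather than a confirmed rule.
    """
    return [i for i, pos in enumerate(positions) if pos < 1]


# Hypothetical variant positions; the third and fifth violate the assumption.
positions = [1, 5, 0, 12, -3]
print(invalid_position_indexes(positions))
```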

- Improves documentation of input and output for `sgkit.genee`
- The *Example* for the `sgkit.genee` docstring is not a final version. It would need a better treatment of the...

The current implementation of `sgkit.genee` only covers one special case of genee. The method does not perform the regularization of genee. In the current implementation, it is expected that regularization...