sgkit
Scalable genetics toolkit
Suggested by @benjeffery here: https://github.com/pystatgen/sgkit/pull/1054#pullrequestreview-1360645992 Do the same for `filters` and `contig_lengths`.
#1043 shows that we should test with a process-based dask cluster. I've tried this by adding `client = dask.distributed.Client(n_workers=1, threads_per_worker=1)` to `conftest.py`, but I get segfaults in workers. Attaching GDB...
As described in #1190, we are currently returning int FILL values (-2) rather than missing data (-1) for INFO fields, and (I think) FORMAT fields as well. I'm not sure...
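To make the distinction in this issue concrete, here is a minimal sketch that tells the two integer sentinels apart. It assumes only the values quoted above (-1 for missing, -2 for fill). The constant and function names are hypothetical and are not taken from sgkit's API, and the reading of "fill" as padding is likewise an assumption.

```python
# Sentinel values as quoted in the issue text; the names are hypothetical.
INT_MISSING = -1  # the value was missing in the input
INT_FILL = -2     # assumed here to mark padding rather than real data


def classify(values):
    """Label each integer as 'missing', 'fill', or 'value'."""
    labels = []
    for v in values:
        if v == INT_MISSING:
            labels.append("missing")
        elif v == INT_FILL:
            labels.append("fill")
        else:
            labels.append("value")
    return labels


def count_sentinels(values):
    """Return (n_missing, n_fill) so unexpected fill values are easy to spot."""
    labels = classify(values)
    return labels.count("missing"), labels.count("fill")
```

A check like `count_sentinels(field_values)` on a parsed INFO field would show whether fill values appear where missing data was expected.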
We [decided](https://github.com/pystatgen/sgkit/discussions/1166) to rename the organization to `sgkit-dev` and to expand `sgkit` to "Scalable genetics toolkit". We should make the corresponding changes and make sure all project resources work.
Does xarray mandate that Zarr chunk sizes must be the same for all variables with a given dimension? This is quite a hard restriction if so, as we can have...
@jeromekelleher to bring over https://github.com/pystatgen/sgkit-publication/pull/94, motivated by https://github.com/pystatgen/sgkit/issues/1168. Related to https://github.com/pystatgen/sgkit/issues/1130.
I'm assuming that variant positions must be 1 or greater in an sgkit dataset. Is this part of the spec, or enforced anywhere? I looked through the docs and couldn't...
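If positions were required to be 1 or greater, a validation step could look like the sketch below. The rule itself is the assumption stated in this issue, not a confirmed part of the spec, and `find_invalid_positions` is a hypothetical helper, not an sgkit function.

```python
def find_invalid_positions(positions, minimum=1):
    """Return the indexes of positions below `minimum`.

    `minimum=1` encodes the assumption from the issue that variant
    positions are 1-based; it is not taken from the sgkit spec.
    """
    return [i for i, pos in enumerate(positions) if pos < minimum]


def check_positions(positions):
    """Raise ValueError if any position breaks the assumed 1-based rule."""
    bad = find_invalid_positions(positions)
    if bad:
        raise ValueError(f"positions below 1 at indexes {bad}")
```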
- Improves documentation of input and output for `sgkit.genee` - The *Example* for the `sgkit.genee` doc string is not a final version. It would need a better treatment of the...
The current implementation of `sgkit.genee` only covers one special case of genee. The method does not perform the regularization of genee. In the current implementation, it is expected that regularization...