tskit
Population-scale genomics
Python 3.12 has been released. The good news is that tskit builds and almost all tests pass. The failing ones are those that depend on `numba` via `lshmm` as `numba`...
Opening this up for discussion. As mentioned by @jeromekelleher at https://github.com/tskit-dev/tskit/pull/2811#issuecomment-1663778875 it has been a long time since these legacy formats were used. When dropping them we should add a...
After #2786 we don't actually use `tsk_diff_iter_t` at all in the library. As it's not part of the public API (and it causes some annoying problems internally, e.g. [here](https://github.com/tskit-dev/tskit/blob/f7ba5489ae9fa7bede54ad856f181c85f8759f6e/c/tskit/trees.c#L447)) I...
We should be able to substantially simplify the stats API algorithms by using the tsk_tree_position_t class. This should be done before we generalise to windows that are not [0, L)...
Once #2782 is implemented we can easily support threading along the genome by following the approach for divergence matrix in #2736.
I think we can rephrase at least ``genetic_relatedness`` (aka eGRM) in terms of ``divergence_matrix``, which should substantially improve performance (although this is waiting on #2779, which is needed for decent site-mode performance)....
Currently the stats API requires that we cover the entire genome with windows, which is restrictive. In particular, it prevents us from parallelising along the genome in a simple way....
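To make the parallelisation point concrete, here is a minimal sketch of what chunked computation could look like once windows no longer need to cover the whole genome. It does not call tskit at all: `chunk_windows`, `combine_span_normalised` and `toy_statistic` are hypothetical names, and `toy_statistic` is a stand-in for any span-normalised statistic evaluated on a single interval. The only fact relied on is that a span-normalised value over [0, L) is the span-weighted mean of the values on a partition of [0, L).

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_windows(sequence_length, num_chunks):
    """Split [0, sequence_length) into contiguous half-open intervals."""
    edges = [sequence_length * k / num_chunks for k in range(num_chunks + 1)]
    return list(zip(edges[:-1], edges[1:]))


def combine_span_normalised(chunks, values):
    """Span-weighted mean of per-chunk values of a span-normalised statistic."""
    total_span = sum(right - left for left, right in chunks)
    weighted = sum(v * (right - left) for (left, right), v in zip(chunks, values))
    return weighted / total_span


def toy_statistic(interval):
    # Hypothetical stand-in for a real statistic restricted to one interval:
    # the mean of f(x) = x over [left, right).
    left, right = interval
    return (left + right) / 2


chunks = chunk_windows(100.0, 4)
# Each chunk is independent of the others, so they can be farmed out to workers.
with ThreadPoolExecutor(max_workers=4) as pool:
    values = list(pool.map(toy_statistic, chunks))
genome_wide = combine_span_normalised(chunks, values)
```

In this toy case the combined value matches `toy_statistic((0.0, 100.0))`. Statistics that are not normalised by span would be summed across chunks rather than averaged.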
Following up on #2736, we need to document the function. Note that I left the old partially implemented version of divergence matrix as a commented-out block here as it...
Currently the divergence matrix supports a list of ``samples``. It would also be useful to support ``individuals`` as a mutually exclusive option. Initially we can implement this by post-processing the...
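As an illustration only, one way such post-processing might look is to average the sample-by-sample matrix over the blocks of samples belonging to each pair of individuals. This is an assumption rather than the approach described in the issue, which is truncated above. The function name `aggregate_by_individual` and the `individual_of_sample` mapping are hypothetical, and the sketch uses plain Python lists rather than any tskit call.

```python
def aggregate_by_individual(sample_matrix, individual_of_sample):
    """Average a sample-by-sample matrix into an individual-by-individual one.

    individual_of_sample[a] is the index of the individual that owns sample a.
    Assumes every individual index in 0..n-1 owns at least one sample.
    """
    n = max(individual_of_sample) + 1
    sums = [[0.0] * n for _ in range(n)]
    counts = [[0] * n for _ in range(n)]
    for a, row in enumerate(sample_matrix):
        for b, value in enumerate(row):
            i, j = individual_of_sample[a], individual_of_sample[b]
            sums[i][j] += value
            counts[i][j] += 1
    # Mean over all sample pairs (a, b) with a in individual i and b in individual j.
    return [[sums[i][j] / counts[i][j] for j in range(n)] for i in range(n)]


# Toy input: four samples belonging to two diploid individuals.
D = [
    [0.0, 2.0, 4.0, 6.0],
    [2.0, 0.0, 4.0, 6.0],
    [4.0, 4.0, 0.0, 2.0],
    [6.0, 6.0, 2.0, 0.0],
]
individual_D = aggregate_by_individual(D, [0, 0, 1, 1])
```

Whether to average or sum, and whether a sample's comparison with itself should count towards the diagonal, are design choices that this sketch does not settle.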
This is a common thing to want to do.