Clarify mutation order in Site() object
This is useful to know without having to dive into the order requirements docs. I'm often looking this up to find the inherited state at a node.
Codecov Report
:white_check_mark: All modified and coverable lines are covered by tests.
:white_check_mark: Project coverage is 89.80%. Comparing base (a2a3401) to head (71a34ee).
:warning: Report is 1 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #3067 +/- ##
=======================================
Coverage 89.80% 89.80%
=======================================
Files 29 29
Lines 31026 31026
Branches 5679 5679
=======================================
Hits 27863 27863
Misses 1777 1777
Partials 1386 1386
| Flag | Coverage Δ | |
|---|---|---|
| c-tests | 86.85% <ø> (ø) | |
| lwt-tests | 80.38% <ø> (ø) | |
| python-c-tests | 87.05% <ø> (ø) | |
| python-tests | 98.84% <ø> (ø) | |
| python-tests-no-jit | 33.60% <ø> (ø) | |
| python-tests-numpy1 | 50.18% <ø> (ø) | |
| Files with missing lines | Coverage Δ | |
|---|---|---|
| python/tskit/trees.py | 98.88% <ø> (ø) |
Note that there are loopholes in the mutation data requirements, which means that it's possible for this to not actually be true (which is a bug, but not one we can easily resolve now).
Right, I know that the sorting process doesn't necessarily enforce parent/child correctness (and this is noted in the docs, and in https://github.com/tskit-dev/tskit/issues/2732) but they also say:
> when there are multiple mutations per site, mutations should be ordered by decreasing time, if known, and parent mutations must occur before their children.. Violations of these sorting requirements are detected at load time.
I guess violations of mutation parent order are not (yet?) detected at load time, so the doc wording should be changed to point out this bug?
Edit - I see this is part of https://github.com/tskit-dev/tskit/issues/2757#issuecomment-1557651165
It's a bit tricky. Maybe you could put in a link to the definitions instead of explaining, so at least it's all in one place?
Maybe we just leave this open until it's (eventually) fixed? It does my head in trying to figure out from the rather involved mutation sorting requirements that the most recent mutations for a site are (should be) at the end of the list. I feel that just needs to be stated simply somewhere, for the non-technical reader.
I think with the enforcement of canonical mutation ordering, this is now true, and this minor doc change ("older mutations will be listed before younger ones at this site") is finally correct and can be merged. Is that right @benjeffery ?
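For readers who want to see why the ordering matters in practice, here is a minimal pure-Python sketch of looking up the state at a node. It assumes the ordering discussed above holds (older mutations, and parent mutations, come first in the per-site list). `Mut`, `state_at_node` and the `parent_of` mapping are hypothetical stand-ins invented for illustration; they are not part of the tskit API.

```python
from dataclasses import dataclass


@dataclass
class Mut:
    """Hypothetical stand-in for a mutation record (not tskit.Mutation)."""
    node: int
    derived_state: str


def state_at_node(ancestral_state, mutations, parent_of, node):
    """Return the allelic state at `node` for a single site.

    Assumes `mutations` is ordered oldest first, with parent mutations
    before their children, and that `parent_of` maps each non-root node
    to its parent in the local tree.
    """
    # Collect the node itself plus all of its ancestors up to the root.
    lineage = set()
    u = node
    while u is not None:
        lineage.add(u)
        u = parent_of.get(u)

    # Walk the list oldest to youngest; under the assumed ordering, the
    # last mutation found on the lineage is the one closest to `node`.
    state = ancestral_state
    for mut in mutations:
        if mut.node in lineage:
            state = mut.derived_state
    return state


# Toy tree: 4 is the root; 3 and 2 are children of 4; 0 and 1 are children of 3.
parent_of = {0: 3, 1: 3, 2: 4, 3: 4}
# Oldest first: A -> T above node 3, then T -> G above node 0.
mutations = [Mut(node=3, derived_state="T"), Mut(node=0, derived_state="G")]

print(state_at_node("A", mutations, parent_of, 0))
print(state_at_node("A", mutations, parent_of, 1))
print(state_at_node("A", mutations, parent_of, 2))
```

If the list were not ordered this way, the loop could end on an older mutation and report the wrong state, which is why it helps to have the ordering stated plainly in the `Site` docs.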