tskit

Li Stephens backwards and tests

Open astheeggeggs opened this issue 3 years ago • 4 comments

Description

This PR adds a collection of tests that check the haploid and diploid Li and Stephens forwards, backwards, and Viterbi algorithms by testing a set of implementations against each other. For the backwards algorithm we also include a tree-based implementation in Python, which uses existing tskit Python functions.
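As an illustration of what "testing implementations against each other" can look like for the haploid case, here is a minimal pure-Python sketch. It is not the code in this PR and it is not the tree-based version. The function names (`forwards`, `backwards`, `brute_force_likelihood`), the variants-by-samples layout `H[l][j]`, the single mutation probability `mu`, and the toy data are all assumptions made for the example. The scaled forward pass is checked against a brute-force sum over every copying path, and the backward pass is checked through the identity that the scaled forward and backward values multiply and sum to 1 at each site.

```python
import itertools
import math


def emission(ref_allele, query_allele, mu):
    """Probability of observing query_allele when copying ref_allele."""
    return 1.0 - mu if ref_allele == query_allele else mu


def forwards(H, s, r, mu):
    """Scaled forward pass. H[l][j] is the allele of haplotype j at site l."""
    m, n = len(H), len(H[0])
    F, c = [], []  # scaled forward rows and per-site scaling factors
    for l in range(m):
        if l == 0:
            row = [emission(H[0][j], s[0], mu) / n for j in range(n)]
        else:
            # F[l - 1] sums to 1, so a switch reaches each haplotype with r[l] / n.
            row = [
                ((1 - r[l]) * F[l - 1][j] + r[l] / n) * emission(H[l][j], s[l], mu)
                for j in range(n)
            ]
        c.append(sum(row))
        F.append([x / c[l] for x in row])
    return F, c


def backwards(H, s, r, mu, c):
    """Backward pass scaled with the factors c from the forward pass."""
    m, n = len(H), len(H[0])
    B = [[1.0] * n for _ in range(m)]
    for l in range(m - 2, -1, -1):
        weighted = [
            emission(H[l + 1][j], s[l + 1], mu) * B[l + 1][j] for j in range(n)
        ]
        switch = r[l + 1] / n * sum(weighted)
        B[l] = [((1 - r[l + 1]) * weighted[j] + switch) / c[l + 1] for j in range(n)]
    return B


def brute_force_likelihood(H, s, r, mu):
    """Sum the probability of every copying path (exponential; toy sizes only)."""
    m, n = len(H), len(H[0])
    total = 0.0
    for path in itertools.product(range(n), repeat=m):
        p = emission(H[0][path[0]], s[0], mu) / n
        for l in range(1, m):
            stay = (1 - r[l]) if path[l] == path[l - 1] else 0.0
            p *= (stay + r[l] / n) * emission(H[l][path[l]], s[l], mu)
        total += p
    return total


# Toy panel: 4 sites (rows) by 3 reference haplotypes (columns); r[0] is unused.
H = [[0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 1]]
s = [0, 0, 1, 1]
r = [0.0, 0.1, 0.1, 0.1]
mu = 0.05

F, c = forwards(H, s, r, mu)
B = backwards(H, s, r, mu, c)
likelihood = math.prod(c)
site_totals = [sum(f * b for f, b in zip(F[l], B[l])) for l in range(len(H))]
```

The brute-force routine enumerates all `n ** m` paths, so it only serves as an independent reference on very small inputs.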

All tests are in test_haplotype_and_genotype_matching.py, which is based on the testing in test_haplotype_matching.py. All matrix-based and tree-based Li and Stephens (LiS) Python functions are included in separate Python files:

Matrix based:
- Forwards backwards
  - fb_haploid_samples_variants.py
  - fb_haploid_variants_samples.py
  - fb_diploid_samples_variants.py
  - fb_diploid_variants_samples.py
- Viterbi
  - fb_haploid_samples_variants.py
  - fb_haploid_variants_samples.py
  - fb_diploid_samples_variants.py
  - fb_diploid_variants_samples.py

Tree based:
- Forwards backwards
  - fb_haploid_variants_samples_tree.py
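For the Viterbi side, a similar cross-check can be sketched on a toy example. The following is a hedged illustration only, not the contents of the files listed above: `viterbi`, `path_probability`, the variants-by-samples layout `H[l][j]`, and the data are hypothetical names and values chosen for the example. The unscaled Viterbi recursion is compared against an exhaustive search over all copying paths. Because ties between equally likely paths are possible, the check compares path probabilities rather than the paths themselves.

```python
import itertools
import math


def emission(ref_allele, query_allele, mu):
    """Probability of observing query_allele when copying ref_allele."""
    return 1.0 - mu if ref_allele == query_allele else mu


def path_probability(H, s, r, mu, path):
    """Joint probability of the query s and one copying path."""
    n = len(H[0])
    p = emission(H[0][path[0]], s[0], mu) / n
    for l in range(1, len(H)):
        stay = (1 - r[l]) if path[l] == path[l - 1] else 0.0
        p *= (stay + r[l] / n) * emission(H[l][path[l]], s[l], mu)
    return p


def viterbi(H, s, r, mu):
    """Unscaled Viterbi; fine for toy inputs, underflows on long sequences."""
    m, n = len(H), len(H[0])
    V = [emission(H[0][j], s[0], mu) / n for j in range(n)]
    pointers = []
    for l in range(1, m):
        best_prev = max(range(n), key=V.__getitem__)
        new_V, back = [], []
        for j in range(n):
            stay = V[j] * (1 - r[l] + r[l] / n)
            switch = V[best_prev] * r[l] / n
            if stay >= switch:
                back.append(j)
                best = stay
            else:
                back.append(best_prev)
                best = switch
            new_V.append(best * emission(H[l][j], s[l], mu))
        pointers.append(back)
        V = new_V
    # Trace back from the most likely final state.
    path = [max(range(n), key=V.__getitem__)]
    for back in reversed(pointers):
        path.append(back[path[-1]])
    path.reverse()
    return path, max(V)


# Toy panel: 4 sites (rows) by 3 reference haplotypes (columns); r[0] is unused.
H = [[0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 1]]
s = [0, 0, 1, 1]
r = [0.0, 0.1, 0.1, 0.1]
mu = 0.05

best_path, best_prob = viterbi(H, s, r, mu)
exhaustive_best = max(
    path_probability(H, s, r, mu, p)
    for p in itertools.product(range(len(H[0])), repeat=len(H))
)
```

Only the best previous state needs to be tracked at each site because every switch has the same probability `r[l] / n`, which keeps each step linear in the number of haplotypes.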

astheeggeggs avatar Feb 28 '22 13:02 astheeggeggs

Codecov Report

Merging #2137 (12de47a) into main (3631396) will decrease coverage by 11.81%. The diff coverage is n/a.

@@             Coverage Diff             @@
##             main    #2137       +/-   ##
===========================================
- Coverage   93.37%   81.55%   -11.82%     
===========================================
  Files          27       27               
  Lines       25591    25453      -138     
  Branches     1163     1113       -50     
===========================================
- Hits        23895    20759     -3136     
- Misses       1666     4627     +2961     
- Partials       30       67       +37     
| Flag | Coverage Δ |
|---|---|
| c-tests | 92.33% <ø> (ø) |
| lwt-tests | 89.05% <ø> (ø) |
| python-c-tests | 72.11% <ø> (ø) |
| python-tests | ? |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
|---|---|
| python/tskit/cli.py | 0.00% <0.00%> (-95.94%) :arrow_down: |
| python/tskit/vcf.py | 9.67% <0.00%> (-90.33%) :arrow_down: |
| python/tskit/text_formats.py | 10.25% <0.00%> (-89.75%) :arrow_down: |
| python/tskit/drawing.py | 10.41% <0.00%> (-89.02%) :arrow_down: |
| python/tskit/combinatorics.py | 13.70% <0.00%> (-85.66%) :arrow_down: |
| python/tskit/stats.py | 35.13% <0.00%> (-64.87%) :arrow_down: |
| python/tskit/trees.py | 48.34% <0.00%> (-49.71%) :arrow_down: |
| python/tskit/util.py | 59.56% <0.00%> (-40.44%) :arrow_down: |
| python/tskit/metadata.py | 79.68% <0.00%> (-19.34%) :arrow_down: |
| python/tskit/tables.py | 84.03% <0.00%> (-14.87%) :arrow_down: |
| ... and 1 more | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 3631396...12de47a.

codecov[bot] avatar Feb 28 '22 13:02 codecov[bot]

Thanks @astheeggeggs! I'll take a proper look later.

jeromekelleher avatar Feb 28 '22 14:02 jeromekelleher

We should be able to add lshmm to our requirements list now, @astheeggeggs, and compare against that rather than using these included files.

Do you want to update this PR, or would it be better if I refactored the existing tests to use lshmm? (This may cause significant merge conflicts with your code, though.)

jeromekelleher avatar Mar 02 '22 11:03 jeromekelleher

I've updated my PR to compare against lshmm, and to include tests only for matrix-based against tree-based implementations of the haploid forwards and backwards algorithms.

astheeggeggs avatar Mar 04 '22 16:03 astheeggeggs

Closing this as it's out of date (WIP on getting similar code in)

jeromekelleher avatar Aug 30 '22 13:08 jeromekelleher