Shing Hei Zhan issues

Results: 16 issues of Shing Hei Zhan

Calculate IQS to assess imputation performance

Imputation quality score (IQS) is another popular way to measure genotype imputation performance (discussed in #2193). It was proposed in this paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2837741/. IQS accounts for chance agreement, whereas overall concordance does...

enhancement

Python API
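
As a rough illustration of the chance-agreement correction described in this issue, here is a minimal sketch that computes a kappa-style score on discrete genotype calls: `(observed - chance) / (1 - chance)`. The function name `iqs` and the use of hard genotype calls instead of posterior probabilities are assumptions made for illustration; this is not the tskit API.

```python
from collections import Counter


def iqs(true_genotypes, imputed_genotypes):
    """Kappa-style imputation quality score on discrete genotype calls.

    Hypothetical helper for illustration only; not part of tskit.
    """
    n = len(true_genotypes)
    if n == 0 or n != len(imputed_genotypes):
        raise ValueError("Inputs must be non-empty and of equal length.")

    # Observed agreement: fraction of genotypes imputed correctly.
    observed = sum(t == i for t, i in zip(true_genotypes, imputed_genotypes)) / n

    # Chance agreement: product of the marginal frequencies, summed over genotypes.
    true_counts = Counter(true_genotypes)
    imputed_counts = Counter(imputed_genotypes)
    chance = sum(true_counts[g] * imputed_counts[g] for g in true_counts) / n**2

    if chance == 1.0:
        # Only one genotype class is present, so the score is undefined.
        return float("nan")
    return (observed - chance) / (1.0 - chance)
```

Under these assumptions, a perfect imputation scores 1 and an imputation that agrees with the truth no more often than chance scores 0.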

Add Wattersons_theta

Fixes #1522

enhancement

Python API

statistics
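
For context, Watterson's estimator is the number of segregating sites divided by the harmonic number a_n = 1 + 1/2 + ... + 1/(n - 1), where n is the sample size. The sketch below works from plain integers; the function name `wattersons_theta` and its signature are hypothetical and do not reflect how the linked pull request implements it.

```python
def wattersons_theta(num_segregating_sites, num_samples):
    """Watterson's estimator: S / a_n, with a_n = sum_{i=1}^{n-1} 1/i.

    Hypothetical standalone helper; not the tskit implementation.
    """
    if num_samples < 2:
        raise ValueError("At least two samples are required.")
    if num_segregating_sites < 0:
        raise ValueError("The number of segregating sites cannot be negative.")

    # Harmonic number over 1..n-1.
    a_n = sum(1.0 / i for i in range(1, num_samples))
    return num_segregating_sites / a_n
```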

Routines to generate tree sequences deterministically rather than stochastically for testing

In #2384, I added a test to check that the string representation of a `Variant` object is correctly formatted. The test involves doing regexp matching (using a predefined pattern)...

enhancement

Calculate information-theoretic scores to assess imputation performance

@castedo has suggested using information-theoretic metrics (e.g., mutual information) to assess imputation accuracy. One appealing feature of IT metrics is that they can handle multi-allelic sites easily (as I understand...

enhancement

Python API
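
One way to picture the mutual information idea is the sketch below, which estimates I(X; Y) in bits from paired true and imputed allele calls. It treats alleles as arbitrary labels, so sites with more than two alleles need no special handling. The function name `mutual_information` and the plug-in (empirical frequency) estimator are assumptions for illustration, not the approach agreed in the issue.

```python
import math
from collections import Counter


def mutual_information(true_alleles, imputed_alleles):
    """Plug-in estimate of mutual information (in bits) between two allele vectors.

    Hypothetical helper for illustration only; not part of tskit.
    """
    n = len(true_alleles)
    if n == 0 or n != len(imputed_alleles):
        raise ValueError("Inputs must be non-empty and of equal length.")

    joint_counts = Counter(zip(true_alleles, imputed_alleles))
    true_counts = Counter(true_alleles)
    imputed_counts = Counter(imputed_alleles)

    mi = 0.0
    for (t, i), count in joint_counts.items():
        # p(t, i) * log2( p(t, i) / (p(t) * p(i)) ), written in terms of counts.
        mi += (count / n) * math.log2(count * n / (true_counts[t] * imputed_counts[i]))
    return mi
```

With this estimator, imputed alleles that are statistically independent of the truth give 0 bits, and a perfect imputation gives the entropy of the true alleles.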

Missing verify_provenances_format in test_output_format?

In `test_highlevel.py`, is there supposed to be `self.verify_provenances_format()` in `test_output_format()`? `provenances_file` is being passed to `ts.dump_text()`, but the output provenance file is not being verified subsequently, like the other output...

Python API

Calculate concordance to assess imputation performance

We want to implement functions in the Variant class to calculate various metrics to evaluate genotype imputation performance (discussed in #2193). The simplest metric is the **_overall concordance_**, which is the...

enhancement

Python API
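
A minimal sketch of a concordance calculation, assuming that overall concordance means the fraction of imputed genotypes that match the true genotypes (the issue text above is truncated, so this definition is an assumption). The standalone function `overall_concordance` is hypothetical and is not a method of the `Variant` class.

```python
def overall_concordance(true_genotypes, imputed_genotypes):
    """Fraction of genotypes that were imputed correctly.

    Hypothetical helper for illustration only; not part of tskit.
    """
    n = len(true_genotypes)
    if n == 0 or n != len(imputed_genotypes):
        raise ValueError("Inputs must be non-empty and of equal length.")

    # Count positions where the imputed call equals the true call.
    matches = sum(t == i for t, i in zip(true_genotypes, imputed_genotypes))
    return matches / n
```

For example, `overall_concordance([0, 1, 1, 0], [0, 1, 0, 0])` compares four genotypes, three of which match.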

Calculate squared correlation to assess imputation performance

Another metric is the squared correlation, which is simply the square of the Pearson correlation coefficient between the allele dosage of the true genotypes and the allele dosage of the...

enhancement

Python API
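
The squared Pearson correlation between two dosage vectors can be sketched in plain Python as follows. The function name `squared_correlation` and the choice to return NaN when either vector has zero variance are assumptions for illustration, not the tskit API.

```python
def squared_correlation(true_dosages, imputed_dosages):
    """Square of the Pearson correlation between two allele dosage vectors.

    Hypothetical helper for illustration only; not part of tskit.
    """
    n = len(true_dosages)
    if n == 0 or n != len(imputed_dosages):
        raise ValueError("Inputs must be non-empty and of equal length.")

    mean_true = sum(true_dosages) / n
    mean_imputed = sum(imputed_dosages) / n

    # Sums of cross-products and squared deviations (the 1/n factors cancel).
    cov = sum((t - mean_true) * (i - mean_imputed)
              for t, i in zip(true_dosages, imputed_dosages))
    var_true = sum((t - mean_true) ** 2 for t in true_dosages)
    var_imputed = sum((i - mean_imputed) ** 2 for i in imputed_dosages)

    if var_true == 0 or var_imputed == 0:
        # Correlation is undefined when either vector is constant.
        return float("nan")
    return cov**2 / (var_true * var_imputed)
```

Because the correlation is squared, a perfectly inverted dosage vector scores the same as a perfectly matching one, so this metric is usually read alongside concordance.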

Add tests for genotype imputation

More flexible interpretation of mutation rate when getting emission probability when using `_tskit.lshmm`

_tskit.lshmm doesn't take NaN in recombination rates