Shing Hei Zhan
The imputation quality score (IQS) is another popular way to measure genotype imputation performance (discussed in #2193). IQS was proposed in this paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2837741/. IQS accounts for chance agreement, whereas overall concordance does...
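As a rough illustration of how IQS corrects for chance agreement, here is a minimal sketch that computes it as a kappa-style statistic, `(P_o - P_c) / (1 - P_c)`, from a table of true versus imputed genotype classes. The function name `iqs`, the 0/1/2 genotype coding, and the use of best-guess genotypes (rather than posterior probabilities) are assumptions made for this example, not part of any existing tskit API.

```python
import numpy as np


def iqs(true_genotypes, imputed_genotypes, num_classes=3):
    """Kappa-style imputation quality score (hypothetical helper).

    Genotypes are assumed to be coded as integers 0..num_classes-1.
    """
    true_genotypes = np.asarray(true_genotypes, dtype=int)
    imputed_genotypes = np.asarray(imputed_genotypes, dtype=int)
    n = true_genotypes.size

    # Contingency table: rows are true classes, columns are imputed classes.
    table = np.zeros((num_classes, num_classes))
    np.add.at(table, (true_genotypes, imputed_genotypes), 1)

    # Observed agreement (this is the overall concordance).
    p_o = np.trace(table) / n
    # Agreement expected by chance, from the row and column marginals.
    p_c = np.sum(table.sum(axis=1) * table.sum(axis=0)) / n**2

    if p_c == 1:
        return float("nan")  # Undefined when chance agreement is total.
    return (p_o - p_c) / (1 - p_c)


# A rare variant imputed as all-reference: high concordance, but IQS is zero.
truth = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
imputed = [0] * 10
print(iqs(truth, imputed))
```

In this toy case the overall concordance is 0.9, yet the agreement is entirely explained by the marginal frequencies, so the score carries no credit for it.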
In #2384, I added a test to check that the string representation of a `Variant` object is correctly formatted. The test involves doing regexp matching (using a predefined pattern)...
@castedo has suggested using information-theoretic metrics (e.g., mutual information) to assess imputation accuracy. One appealing feature of IT metrics is that they can handle multi-allelic sites easily (as I understand...
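To make the multi-allelic point concrete, here is a small sketch of mutual information (in bits) between true and imputed alleles, estimated from their empirical joint distribution. The function name `mutual_information` and the integer allele coding are assumptions for illustration; this is one possible information-theoretic metric, not necessarily the one proposed in the discussion.

```python
import numpy as np


def mutual_information(true_alleles, imputed_alleles):
    """Empirical mutual information in bits (hypothetical helper).

    Alleles are assumed to be coded as non-negative integers, so any
    number of alleles per site is handled the same way.
    """
    true_alleles = np.asarray(true_alleles, dtype=int)
    imputed_alleles = np.asarray(imputed_alleles, dtype=int)
    num_alleles = int(max(true_alleles.max(), imputed_alleles.max())) + 1

    # Empirical joint distribution of (true, imputed) allele pairs.
    joint = np.zeros((num_alleles, num_alleles))
    np.add.at(joint, (true_alleles, imputed_alleles), 1)
    joint /= joint.sum()

    # Product of the marginals, i.e. the joint under independence.
    expected = joint.sum(axis=1, keepdims=True) * joint.sum(axis=0, keepdims=True)

    # Sum only over observed pairs; zero-probability cells contribute nothing.
    nonzero = joint > 0
    return float(np.sum(joint[nonzero] * np.log2(joint[nonzero] / expected[nonzero])))


# A tri-allelic site imputed perfectly: MI equals the entropy of the truth.
print(mutual_information([0, 1, 2, 0, 1, 2], [0, 1, 2, 0, 1, 2]))
```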
In `test_highlevel.py`, is there supposed to be `self.verify_provenances_format()` in `test_output_format()`? `provenances_file` is being passed to `ts.dump_text()`, but the output provenance file is not being verified subsequently, like the other output...
We want to implement functions in the `Variant` class to calculate various metrics to evaluate genotype imputation performance (discussed in #2193). The simplest metric is the **_overall concordance_**, which is the...
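A minimal sketch of overall concordance, assuming the usual definition (the proportion of imputed genotypes that match the true genotypes). The standalone function `overall_concordance` is hypothetical and is not the proposed `Variant` method.

```python
import numpy as np


def overall_concordance(true_genotypes, imputed_genotypes):
    """Fraction of genotypes imputed correctly (hypothetical helper)."""
    true_genotypes = np.asarray(true_genotypes)
    imputed_genotypes = np.asarray(imputed_genotypes)
    if true_genotypes.shape != imputed_genotypes.shape:
        raise ValueError("Genotype arrays must have the same shape.")
    # Element-wise comparison, then the mean of the boolean matches.
    return float(np.mean(true_genotypes == imputed_genotypes))


# Three of the four genotypes agree.
print(overall_concordance([0, 1, 2, 1], [0, 1, 1, 1]))
```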
Another metric is the squared correlation, which is simply the square of the Pearson correlation coefficient between the allele dosage of the true genotypes and the allele dosage of the...
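The squared correlation can be sketched with `numpy.corrcoef`. The function name `squared_correlation` and the choice to return NaN when either dosage vector has zero variance are assumptions for this example.

```python
import numpy as np


def squared_correlation(true_dosage, imputed_dosage):
    """Squared Pearson correlation between allele dosages (hypothetical helper)."""
    true_dosage = np.asarray(true_dosage, dtype=float)
    imputed_dosage = np.asarray(imputed_dosage, dtype=float)
    # The correlation is undefined if either vector is constant.
    if np.std(true_dosage) == 0 or np.std(imputed_dosage) == 0:
        return float("nan")
    r = np.corrcoef(true_dosage, imputed_dosage)[0, 1]
    return float(r**2)


# Imputed dosages may be fractional (expected allele counts).
print(squared_correlation([0, 1, 2, 1], [0.1, 0.9, 1.8, 1.2]))
```

Note that a monomorphic site yields NaN here, which is one reason this metric is usually reported alongside others.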
## Description

Add toy examples for testing imputation. Results obtained by running BEAGLE 4.1 are stored for comparison.

Fixes #2802

# PR Checklist:

- [x] Implement BEAGLE's interpolation-style imputation algorithm...
Currently, the emission probability for calculating the forward, backward, and Viterbi matrices is defined as follows:

```
p_e = mu;
if (is_match) {
    p_e = 1 - (num_alleles - 1)...
```
I was trying to run `_tskit.lshmm` using recombination rates obtained by calling `msprime.RateMap.get_rates()`. `_tskit.lshmm` doesn't really handle NaN. I recall encountering this before. Setting the NaN values to zero bypassed...
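The workaround described above can be sketched with plain NumPy. The helper `replace_nan_rates` is hypothetical, and the example array stands in for rates obtained from a rate map; whether zero is an appropriate substitute for missing rates is an assumption that depends on the analysis.

```python
import numpy as np


def replace_nan_rates(rates, fill_value=0.0):
    """Return a copy of `rates` with NaN entries replaced (hypothetical helper)."""
    rates = np.asarray(rates, dtype=float)
    # Only NaN is replaced; finite and infinite values are left untouched.
    return np.where(np.isnan(rates), fill_value, rates)


# Rate maps often have NaN in flanking regions with no data.
rates = np.array([np.nan, 1e-8, 2e-8, np.nan])
clean_rates = replace_nan_rates(rates)
print(clean_rates)
assert not np.isnan(clean_rates).any()
```

Checking for NaN explicitly before passing the array on makes the failure mode visible rather than relying on downstream code to cope with it.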