
Is our point mutation operator biologically sound?

Open mauriceling opened this issue 10 years ago • 0 comments

In our philosophical paper, we need to establish whether computational mutation operators are a good representation of biological mutations. A single mutation is easy to explain and argue for, but repeated mutations can be tricky.

Volles and Lansbury (2005, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1166583/) provide error-prone PCR data that we can use as a baseline to compare against.

Basically, PCR is an experimental technique for amplifying DNA in which the number of DNA molecules doubles per cycle: 1 --> 2 --> 4 --> 8... In the paper, 30 cycles were used, so 1 molecule is amplified to 2^30, or about 10^9 (1 billion), molecules. The mutation rate depends on the enzyme used. In this case, it is Taq polymerase, with an error rate of 1 point mutation per 9000 bases (http://www.ncbi.nlm.nih.gov/pubmed/2847780). They then randomly picked 89 molecules from this pool of about 1 billion for evaluation (this is costly).
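The doubling arithmetic can be checked directly. This is a minimal sketch that assumes ideal doubling in every cycle; real PCR efficiency is lower, so the figure is an upper bound.

```python
# Ideal PCR amplification: the molecule count doubles every cycle.
cycles = 30
molecules = 2 ** cycles  # starting from a single template molecule

print(molecules)           # 1073741824
print(f"{molecules:.2e}")  # 1.07e+09
```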

Allowable bases: a, t, g, c

Mutation rate: 1.0/9000

Initial sequence: ctttcaaaggccaaggagggagttgtggctgctgctgagaaaaccaaacagggtgtggcagaagcagcaggaaagacaaaagagggtgttctctatgtaggctccaaaaccaaggagggagtggtgcatggtgtggcaacagtggctgagaagaccaaagagcaagtgacaaatgttggaggagcagtggtgacgggtgtgacagcagtagcccagaagacagtggagggagcagggagcattgcagcagccactggctttgtcaaaaaggaccagttgggcaagaatgaagaaggagccccacaggaaggaattctggaagatatgcctgtggatcctgacaatgaggcttatgaaatgccttctgaggaagggtatcaagactacgaacctgaagcctaa
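With these parameters, a single pass of a point mutation operator could be sketched as follows. The function name `point_mutate` is hypothetical and this is not necessarily how our operator is implemented. It also assumes that a mutated base is replaced by one of the other three bases with equal probability, which is a simplification.

```python
import random

BASES = "atgc"               # allowable bases
MUTATION_RATE = 1.0 / 9000   # per-base mutation rate

def point_mutate(sequence, rate=MUTATION_RATE, rng=random):
    """Return a copy of `sequence` in which each base is independently
    replaced by a different base with probability `rate`."""
    out = []
    for base in sequence:
        if rng.random() < rate:
            # Assumption: the three alternative bases are equally likely.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

# Usage on a short prefix of the initial sequence, with a seeded generator.
rng = random.Random(42)
prefix = "ctttcaaaggccaaggagggagttgtggctgctgctgag"
mutated = point_mutate(prefix, rng=rng)
print(sum(a != b for a, b in zip(prefix, mutated)), "mutation(s)")
```

Passing a seeded `random.Random` instance keeps runs reproducible, which helps when comparing simulated results against the experimental data.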

Volles and Lansbury's (2005) results are as follows:

| Wild-type base | A | G | C | T |
|---|---|---|---|---|
| A | 11233 | 64 | 19 | 73 |
| G | 20 | 11979 | 2 | 12 |
| C | 8 | 2 | 6296 | 13 |
| T | 44 | 8 | 24 | 5975 |

That is 289 errors out of 35772 bases examined, which gives a detected error rate of about 0.0081 mutations per base.
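The totals can be recomputed from the reported counts. In this sketch the matrix is ordered A, G, C, T in both directions, the diagonal holds unchanged bases, and every off-diagonal entry is a detected substitution.

```python
# Substitution counts reported by Volles and Lansbury (2005),
# ordered A, G, C, T in both rows and columns.
counts = [
    [11233, 64, 19, 73],
    [20, 11979, 2, 12],
    [8, 2, 6296, 13],
    [44, 8, 24, 5975],
]

total = sum(sum(row) for row in counts)           # all bases examined
unchanged = sum(counts[i][i] for i in range(4))   # diagonal entries
errors = total - unchanged                        # off-diagonal entries
rate = errors / total

print(errors, total, round(rate, 4))  # 289 35772 0.0081
```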

Let's see if we can replicate this 0.0081.
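One way to attempt this is a small Monte Carlo sketch. It rests on several assumptions that are not from the paper: each sampled molecule is treated as an independent lineage (real PCR products share ancestry), every lineage is copied `copies` times, and substitutions are uniform over the other three bases. The names `point_mutate` and `detected_error_rate` are hypothetical.

```python
import random

BASES = "atgc"

def point_mutate(sequence, rate, rng):
    """Independently substitute each base with probability `rate`."""
    return "".join(
        rng.choice([b for b in BASES if b != base]) if rng.random() < rate else base
        for base in sequence
    )

def detected_error_rate(wild_type, rate, copies, sample_size, rng):
    """Mutate `sample_size` independent lineages `copies` times each and
    return the fraction of positions that differ from `wild_type`."""
    differences = 0
    for _ in range(sample_size):
        seq = wild_type
        for _ in range(copies):
            seq = point_mutate(seq, rate, rng)
        differences += sum(a != b for a, b in zip(seq, wild_type))
    return differences / (sample_size * len(wild_type))

rng = random.Random(2014)
wild_type = "ctttcaaaggccaaggagggagttgtggctgctgctgag"  # prefix of the initial sequence
print(detected_error_rate(wild_type, 1.0 / 9000, copies=30, sample_size=89, rng=rng))
```

For small rates, the expected value under this model is roughly `copies * rate`, so 30 copies at 1/9000 give about 0.0033. Comparing the simulated figure with 0.0081 therefore tests both the assumed number of copying events and the assumed per-base rate.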

mauriceling avatar Mar 26 '14 08:03 mauriceling