hyphy-analyses
hyphy-analyses copied to clipboard
Calculating gene wide dn/dS on foreground vs. background branches
Hi, Is there a way to estimate gene-wide dN/dS on a set of test foreground vs. background branches for several genes as done in these two papers. I have attached the figures where they estimated average gene dN/dS ratios for test vs. foreground branches on a tree. I understand this approach may have some methodological flaws, but since I wanted to compare our results with what they found, if there is way to do this in the recent Hyphy models, it would be useful to know.
https://academic.oup.com/mbe/article/40/11/msad241/7341929
https://www.nature.com/articles/s42003-021-01688-z
Dear @yashsondhi,
With stock analyses with no changes at all you can do something like this. Let me know if this helps/makes sense.
You can estimate per-branch dN/dS
using FitMG94.
For example, if you take the blue-opsin from https://datadryad.org/stash/dataset/doi:10.5061/dryad.gmsbcc2kr (one the papers you cited; I attach the .phy file and the .nwk file with PAML foreground #1 notation replaced with {FOREGROUND}
in HyPhy notation), you can run
hyphy FitMG94.bf
--alignment /Users/sergei/Downloads/doi_10-13/opsin_evolution_Fig2\ 2/24_taxa_analysis/blue_opsin/branch/blue_dna_trim.phy
--tree /Users/sergei/Downloads/doi_10-13/opsin_evolution_Fig2\ 2/24_taxa_analysis/blue_opsin/branch/blue_dna_trim.nwk
--type local
The results will be available in the JSON file and printed to the screen (including some infinite estimates).
| Branch | Length | dN/dS |Approximate dN/dS CI|
|:----------------------------:|:--------------:|:--------------:|:------------------:|
|Nepticuloidea_Nepticulidae_...| 0.345 | 0.123 | 0.096 - 0.154 |
|Eriocranioidea_Eriocraniida...| 0.768 | 0.019 | 0.013 - 0.026 |
|Hepialoidea_Hepialidae_Trio...| 0.268 | 0.055 | 0.037 - 0.078 |
| Node4 | 0.021 |10000000000.0...|0.000 - 10000.000...|
|Adeloidea_Adelidae_Adelinae...| 0.332 | 0.043 | 0.029 - 0.060 |
| Node3 | 0.038 | 0.099 | 0.034 - 0.205 |
|Papilionoidea_Lycaenidae_Po...| 0.406 | 0.087 | 0.068 - 0.110 |
|Papilionoidea_Nymphalidae_S...| 0.140 | 0.031 | 0.014 - 0.056 |
| Node15 | 0.025 | 0.176 | 0.061 - 0.362 |
|Papilionoidea_Nymphalidae_D...| 0.300 | 0.047 | 0.032 - 0.066 |
| Node14 | 0.049 | 0.069 | 0.025 - 0.141 |
|Papilionoidea_Nymphalidae_H...| 0.218 | 0.043 | 0.027 - 0.065 |
| Node13 | 0.074 | 0.132 | 0.077 - 0.206 |
|Pyraloidea_Pyralidae_Amyelo...| 0.208 | 0.021 | 0.010 - 0.037 |
| Node12 | 0.006 |10000000000.0...|0.000 - 10000.000...|
|Drepanoidea_Drepanidae_Drep...| 0.187 | 0.012 | 0.005 - 0.025 |
|Drepanoidea_Drepanidae_Drep...| 0.186 | 0.019 | 0.009 - 0.034 |
| Node24 | 0.051 | 0.027 | 0.006 - 0.068 |
|Bombycoidea_Bombycidae_Bomb...| 0.234 | 0.013 | 0.006 - 0.024 |
|Noctuoidea_Notodontidae_Nys...| 0.159 | 0.010 | 0.003 - 0.023 |
| Node27 | 0.043 | 0.000 | 0.000 - 0.016 |
| Node23 | 0.003 | 0.000 | 0.000 - 0.318 |
|Noctuoidea_Erebidae_Arctiin...| 0.188 | 0.044 | 0.027 - 0.067 |
|Noctuoidea_Erebidae_Calpina...| 0.130 | 0.003 | 0.000 - 0.014 |
| Node31 | 0.071 | 0.022 | 0.007 - 0.052 |
|Noctuoidea_Noctuidae_Amphip...| 0.087 | 0.030 | 0.012 - 0.060 |
|Noctuoidea_Noctuidae_Noctui...| 0.105 | 0.038 | 0.019 - 0.068 |
| Node34 | 0.039 | 0.076 | 0.030 - 0.152 |
| Node30 | 0.035 | 0.043 | 0.012 - 0.105 |
| Node22 | 0.013 | 0.187 | 0.045 - 0.458 |
|Bombycoidea_Sphingidae_Sphi...| 0.169 | 0.030 | 0.016 - 0.051 |
| Node21 | 0.013 | 0.625 | 0.244 - 1.226 |
| Node11 | 0.051 | 0.011 | 0.000 - 0.047 |
|Papilionoidea_Papilionidae_...| 0.118 | 0.016 | 0.005 - 0.036 |
|Papilionoidea_Papilionidae_...| 0.149 | 0.041 | 0.023 - 0.067 |
| Node38 | 0.176 | 0.056 | 0.035 - 0.083 |
| Node10 | 0.016 | 0.096 | 0.008 - 0.285 |
|Alucitoidea_Alucitidae_Aluc...| 0.161 | 0.088 | 0.059 - 0.124 |
| Node9 | 0.012 | 0.052 | 0.000 - 0.232 |
|Papilionoidea_Hesperiidae_H...| 0.238 | 0.107 | 0.079 - 0.141 |
| Node8 | 0.021 | 0.196 | 0.071 - 0.404 |
| Node2 | 0.016 |10000000000.0...|0.000 - 10000.000...|
|Tischerioidea_Tischeriidae_...| 0.324 | 0.031 | 0.019 - 0.046 |
To get the dN/dS estimates from JSON files, you can use the jq
tool, like so
jq -r '["Branch", "dN/dS"], (.["branch attributes"]["0"] | to_entries | map ([.key,.value["Confidence Intervals"].MLE])[]) | @csv' paht/to/blue_dna_trim.phy.FITTER.json
You can also use many of the standard analyses, e.g. BUSTED
to compute mean dN/dS estimates for branch groups (i.e. estimate the shared dN/dS parameter as opposed to averaging per-branch estimates post hoc
). This is a bit of an overkill, because BUSTED
uses these estimates only as an initial guess for subsequent optimizations.
hyphy busted
--alignment /Users/sergei/Downloads/doi_10-13/opsin_evolution_Fig2\ 2/24_taxa_analysis/blue_opsin/branch/blue_dna_trim.phy
--tree /Users/sergei/Downloads/doi_10-13/opsin_evolution_Fig2\ 2/24_taxa_analysis/blue_opsin/branch/blue_dna_trim.nwk
--branches Foreground
--srv No
...
### Improving branch lengths, nucleotide substitution biases, and global dN/dS ratios under a full codon model
* Log(L) = -11214.94, AIC-c = 22548.88 (59 estimated parameters)
* non-synonymous/synonymous rate ratio for *background* = 0.0370
* non-synonymous/synonymous rate ratio for *test* = 0.0567
And to get them from the BUSTED .json, use
jq -r '["Branches", "dN/dS"], (.fits["MG94xREV with separate rates for branch sets"]["Rate Distributions"] | to_entries | map ([.key,.value[0][0]])[]) | @csv ' /path/to/blue_dna_trim.phy.BUSTED.json
Best, Sergei
Hi Sergei, This is most helpful, thank you so much, this is exactly what I was looking for. Cheers
On Mon, Nov 20, 2023, 7:01 PM Sergei Pond @.***> wrote:
Dear @yashsondhi https://github.com/yashsondhi,
With stock analyses with no changes at all you can do something like this. Let me know if this helps/makes sense.
You can estimate per-branch dN/dS using FitMG94 https://github.com/veg/hyphy-analyses/tree/master/FitMG94. For example, if you take the blue-opsin from https://datadryad.org/stash/dataset/doi:10.5061/dryad.gmsbcc2kr (one the papers you cited; I attach the .phy file and the .nwk file with PAML foreground #1 https://github.com/veg/hyphy-analyses/issues/1 notation replaced with {FOREGROUND} in HyPhy notation), you can run
hyphy FitMG94.bf --alignment /Users/sergei/Downloads/doi_10-13/opsin_evolution_Fig2\ 2/24_taxa_analysis/blue_opsin/branch/blue_dna_trim.phy --tree /Users/sergei/Downloads/doi_10-13/opsin_evolution_Fig2\ 2/24_taxa_analysis/blue_opsin/branch/blue_dna_trim.nwk --type local
The results will be available in the JSON file and printed to the screen (including some infinite estimates).
Branch Length dN/dS Approximate dN/dS CI Nepticuloidea_Nepticulidae_... 0.345 0.123 0.096 - 0.154 Eriocranioidea_Eriocraniida... 0.768 0.019 0.013 - 0.026 Hepialoidea_Hepialidae_Trio... 0.268 0.055 0.037 - 0.078 Node4 0.021 10000000000.0... 0.000 - 10000.000... Adeloidea_Adelidae_Adelinae... 0.332 0.043 0.029 - 0.060 Node3 0.038 0.099 0.034 - 0.205 Papilionoidea_Lycaenidae_Po... 0.406 0.087 0.068 - 0.110 Papilionoidea_Nymphalidae_S... 0.140 0.031 0.014 - 0.056 Node15 0.025 0.176 0.061 - 0.362 Papilionoidea_Nymphalidae_D... 0.300 0.047 0.032 - 0.066 Node14 0.049 0.069 0.025 - 0.141 Papilionoidea_Nymphalidae_H... 0.218 0.043 0.027 - 0.065 Node13 0.074 0.132 0.077 - 0.206 Pyraloidea_Pyralidae_Amyelo... 0.208 0.021 0.010 - 0.037 Node12 0.006 10000000000.0... 0.000 - 10000.000... Drepanoidea_Drepanidae_Drep... 0.187 0.012 0.005 - 0.025 Drepanoidea_Drepanidae_Drep... 0.186 0.019 0.009 - 0.034 Node24 0.051 0.027 0.006 - 0.068 Bombycoidea_Bombycidae_Bomb... 0.234 0.013 0.006 - 0.024 Noctuoidea_Notodontidae_Nys... 0.159 0.010 0.003 - 0.023 Node27 0.043 0.000 0.000 - 0.016 Node23 0.003 0.000 0.000 - 0.318 Noctuoidea_Erebidae_Arctiin... 0.188 0.044 0.027 - 0.067 Noctuoidea_Erebidae_Calpina... 0.130 0.003 0.000 - 0.014 Node31 0.071 0.022 0.007 - 0.052 Noctuoidea_Noctuidae_Amphip... 0.087 0.030 0.012 - 0.060 Noctuoidea_Noctuidae_Noctui... 0.105 0.038 0.019 - 0.068 Node34 0.039 0.076 0.030 - 0.152 Node30 0.035 0.043 0.012 - 0.105 Node22 0.013 0.187 0.045 - 0.458 Bombycoidea_Sphingidae_Sphi... 0.169 0.030 0.016 - 0.051 Node21 0.013 0.625 0.244 - 1.226 Node11 0.051 0.011 0.000 - 0.047 Papilionoidea_Papilionidae_... 0.118 0.016 0.005 - 0.036 Papilionoidea_Papilionidae_... 0.149 0.041 0.023 - 0.067 Node38 0.176 0.056 0.035 - 0.083 Node10 0.016 0.096 0.008 - 0.285 Alucitoidea_Alucitidae_Aluc... 0.161 0.088 0.059 - 0.124 Node9 0.012 0.052 0.000 - 0.232 Papilionoidea_Hesperiidae_H... 0.238 0.107 0.079 - 0.141 Node8 0.021 0.196 0.071 - 0.404 Node2 0.016 10000000000.0... 0.000 - 10000.000... Tischerioidea_Tischeriidae_... 0.324 0.031 0.019 - 0.046 To get the dN/dS estimates from JSON files, you can use the jq tool, like so
jq -r '["Branch", "dN/dS"], (.["branch attributes"]["0"] | to_entries | map ([.key,.value["Confidence Intervals"].MLE])[]) | @csv' paht/to/blue_dna_trim.phy.FITTER.json
You can also use many of the standard analyses, e.g. BUSTED to compute mean dN/dS estimates for branch groups (i.e. estimate the shared dN/dS parameter as opposed to averaging per-branch estimates post hoc). This is a bit of an overkill, because BUSTED uses these estimates only as an initial guess for subsequent optimizations.
hyphy busted --alignment /Users/sergei/Downloads/doi_10-13/opsin_evolution_Fig2\ 2/24_taxa_analysis/blue_opsin/branch/blue_dna_trim.phy --tree /Users/sergei/Downloads/doi_10-13/opsin_evolution_Fig2\ 2/24_taxa_analysis/blue_opsin/branch/blue_dna_trim.nwk --branches Foreground --srv No ...
Improving branch lengths, nucleotide substitution biases, and global dN/dS ratios under a full codon model
- Log(L) = -11214.94, AIC-c = 22548.88 (59 estimated parameters)
- non-synonymous/synonymous rate ratio for background = 0.0370
- non-synonymous/synonymous rate ratio for test = 0.0567
And to get them from the BUSTED .json, use
jq -r '["Branches", "dN/dS"], (.fits["MG94xREV with separate rates for branch sets"]["Rate Distributions"] | to_entries | map ([.key,.value[0][0]])[]) | @csv ' /path/to/blue_dna_trim.phy.BUSTED.json
Best, Sergei
Archive.zip https://github.com/veg/hyphy-analyses/files/13420393/Archive.zip
— Reply to this email directly, view it on GitHub https://github.com/veg/hyphy-analyses/issues/47#issuecomment-1819998574, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEAELVIUHQAQ3KCCL6Y376LYFPVMDAVCNFSM6AAAAAA7TC7NYSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQMJZHE4TQNJXGQ . You are receiving this because you were mentioned.Message ID: @.***>