hyphy
Foreground vs background in BUSTED
Hello!
I have a question about BUSTED that I am hoping you can help with.
My dataset is a population of SARS-CoV-2 viruses that has been aligned to the original Wuhan reference sequence and trimmed to the individual genes.
When running BUSTED, would it be correct to set the Wuhan reference sequence as background while leaving the rest of the dataset as foreground branches for testing? I am unsure whether setting the reference sequence as background means it is completely ignored (and testing only involves the foreground), or whether it means that selection is determined for the foreground branches relative to the background.
Alternatively, would it be more appropriate to select the entire dataset as foreground and not specify any background branches if my potential background (the Wuhan reference) is only a single sequence?
Thank you!
Dear @katherine-li07,
My standard recommendation for viral isolates, where one sequence corresponds to one patient, is to run BUSTED on internal branches only (see https://pubmed.ncbi.nlm.nih.gov/26814962/).

Also, a single sequence should not affect such an analysis in most cases, and in any event, excluding all terminal branches will also exclude the Wu-1 reference.
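For anyone preparing the tree by hand, here is a minimal Python sketch of marking only internal branches as the test set. It relies on HyPhy's convention of labelling a branch by appending `{tag}` to the node name in the Newick string. The function name `tag_internal_branches`, the tag `FG`, and the toy tree are made up for illustration, and the regex assumes unquoted node labels without special characters, so treat it as a starting point rather than a robust Newick parser.

```python
import re

def tag_internal_branches(newick, tag="FG"):
    """Append {tag} to every internal node that has a parent branch.

    Internal nodes are the ones whose (possibly empty) label directly
    follows a closing parenthesis. A root written as ");" is left
    untagged because it has no branch above it.
    Assumes unquoted labels (hypothetical helper, not part of HyPhy).
    """
    pattern = r"\)([^():,;]*)(?=[:,)])"
    return re.sub(pattern, lambda m: ")" + m.group(1) + "{" + tag + "}", newick)

# Toy tree: Wu1 and the patient isolates are all terminal branches.
tree = "((A:0.1,B:0.2):0.05,(C:0.1,D:0.1):0.02,Wu1:0.0);"
print(tag_internal_branches(tree))
```

In the toy tree, only the two internal branches receive the tag; `Wu1` and the other tips stay unlabelled and are therefore not part of the test set.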
Hope this helps, Sergei
Great, thank you for your help!
Dear @katherine-li07,
Happy to help. Let me know how it goes. For SC-2 data, you should also remove all identical sequences (HyPhy will warn you if you have those), because they make the analysis run slower and do not contribute any signal to the test.
See https://github.com/veg/hyphy-analyses/tree/master/remove-duplicates
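To illustrate the idea only (this is not the linked remove-duplicates script), here is a small Python sketch that keeps the first copy of each distinct sequence in a FASTA alignment. The helper names `parse_fasta` and `drop_identical` and the toy records are hypothetical, and the comparison is assumed to be a case-insensitive exact match.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (name, sequence) tuples."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records

def drop_identical(records):
    """Keep the first record for each distinct sequence (case-insensitive)."""
    seen = set()
    kept = []
    for name, seq in records:
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            kept.append((name, seq))
    return kept

# Toy alignment: p1 and p3 are identical to the Wu1 reference.
example = ">Wu1\nATGAAA\n>p1\nATG\nAAA\n>p2\nATGAAG\n>p3\natgaaa\n"
for name, seq in drop_identical(parse_fasta(example)):
    print(f">{name}\n{seq}")
```

After dropping sequences this way, the tree has to be rebuilt or pruned so that its tips match the reduced alignment.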
Best, Sergei
Hi Sergei,
I am running on the Datamonkey web server, which I believe already removes duplicate sequences automatically.
I have been able to run all of my sequences without any trouble except for one, which seems to be stuck somewhere. It says "Could not contact the server for job status updates" and has disappeared from the queue, but still appears to be running. Ticket number 2332809.new-silverback. Do you know how I can resolve this?
Thanks! Katherine
Dear @katherine-li07,
The job completed successfully, but there must have been a communication error within our site. I can either email the results to you or post them here, whichever you prefer.
If you would like the results emailed, please contact me at [email protected].
Best, Steven
Thank you Steven and Sergei for your help.
I am also wondering how BUSTED results relate to FUBAR results. I understand that FUBAR reports positive and negative selection at individual sites, whereas BUSTED reports positive selection across an entire gene, but should the two generally agree on where selection is reported?
For example, I have run the same files through both BUSTED and FUBAR, and in some cases FUBAR reported multiple sites under positive selection, but BUSTED did not detect selection for the gene.
In this case, I am wondering if it is appropriate to compare the results of these two tools, or if I should be sticking to one over the other.
Thank you! Katherine