
Partitioned data with starting tree

Open pbravakos opened this issue 5 years ago • 8 comments

What is the issue that you are having?

I am running MrBayes in batch mode with partitioned data and a starting tree. Both are read correctly, but the run gets stuck at the output below:

 Initial log likelihoods and log prior probs for run 1:
      Chain 1 -- -179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000000 -- -nan

   There are 11 more chains on other processor(s)


   Using a relative burnin of 25.0 % for diagnostics

   Chain results (90000000 generations requested):

      0 -- [-179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000] [...11 remote chains...]
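A note on the number in this output: it appears to be the negative of the largest finite double-precision value (DBL_MAX in C), printed with six decimals, rather than a real log likelihood. The short Python check below reproduces the same digit string. It only describes the number itself; how MrBayes arrives at it internally is not established in this thread.

```python
import sys

# Largest finite IEEE 754 double; C exposes the same value as DBL_MAX.
largest = sys.float_info.max

# "%f" prints six decimals, matching the layout of the log line above.
formatted = "%f" % -largest
print(formatted)
```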

What is the environment that you run MrBayes in?

  • Operating system (including variant and release): Debian 9
  • Version of MrBayes: MrBayes 3.2.7a x86_64
  • If possible, include the output of the Version command in MrBayes below:
Version

   MrBayes 3.2.7a

   Features:  SSE MPI
   Host type: x86_64-unknown-linux-gnu (CPU: x86_64)
   Compiler:  gnu 6.3.0


Other information that may be of use to us in resolving this issue

It seems to be running and uses all available CPUs, but no output is generated, even if I wait many hours. If I run the exact same command without a starting tree, I get normal output (and a log likelihood instead of "nan"). Thanks Panos

pbravakos avatar Jun 14 '19 09:06 pbravakos

Dear pbravakos, thank you for your report. Indeed, the output indicates an erroneous program behavior. We would, however, be helped by your exact input commands, and preferably also your input data (not shared by anyone else), in order to reproduce this issue. Please contact us personally if you wish to share your information. Cheers Johan

nylander avatar Jun 17 '19 13:06 nylander

Dear Johan, I sent you an email with the input data and commands. Hope you received it. Cheers Panos

pbravakos avatar Jun 17 '19 15:06 pbravakos

Dear Panos, Received your email. Thank you so much. /Johan

nylander avatar Jun 18 '19 06:06 nylander

gh-110

  • Last modified: Thu Jun 20, 2019 02:41
  • Sign: JN

Description

When analyzing partitioned protein data with a mixed amino acid model (aamodelpr=mixed), likelihood calculations fail unless aamodel is unlinked across partitions. See issue #112.

In addition, the user tree contains zero-length branches, which generally cause problems when the tree is used in optimization. With this example data, the program freezes after N generations.

Workaround

  • Apply unlink aamodel=(all)
  • Replace zero-length branches with a small positive value, for example:
perl -pe 'if (/^\s+tree\s+PhyMLTree/){s/0\.0+(,|\))/0\.000000001$1/g}' Pseudomonas_MSA_MrBayes.nex > Pseudomonas_MSA_MrBayes_no_zero_length_branches.nex
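For readers who prefer not to use a Perl one-liner, the same idea can be sketched in Python. This is an illustrative sketch under stated assumptions, not part of MrBayes: the function names `replace_zero_branches` and `fix_nexus` are hypothetical, and the pattern differs slightly from the Perl version in that it anchors on the leading colon, so a length such as 10.0 is left alone. As in the one-liner, only lines that start with whitespace followed by `tree PhyMLTree` are modified.

```python
import re

# A branch length that is exactly zero (":0", ":0.0", ":0.000000"),
# followed by "," ")" or ";". Anchoring on ":" avoids touching "10.0".
ZERO_BRANCH = re.compile(r":0(?:\.0+)?(?=[,);])")


def replace_zero_branches(newick: str, epsilon: str = "0.000000001") -> str:
    """Replace zero-length branches in a Newick string with epsilon."""
    return ZERO_BRANCH.sub(":" + epsilon, newick)


def fix_nexus(text: str, tree_name: str = "PhyMLTree") -> str:
    """Apply the replacement only to the named tree line of a NEXUS file."""
    tree_line = re.compile(r"^\s+tree\s+" + re.escape(tree_name))
    fixed = [
        replace_zero_branches(line) if tree_line.match(line) else line
        for line in text.splitlines(keepends=True)
    ]
    return "".join(fixed)


if __name__ == "__main__":
    print(replace_zero_branches("((A:0.1,B:0.000000):0.0,C:10.0);"))
```

To use it on a file, read the NEXUS file as text, pass the contents to `fix_nexus`, and write the result to a new file, keeping the original untouched.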

Original commands, with workarounds added

begin mrbayes;
    set autoclose=yes nowarn=yes;
    exe Pseudomonas_MSA_MrBayes_no_zero_length_branches.nex;
    lset applyto=(all) nucmodel=protein rates=invgamma ngammacat=4;
    unlink statefreq=(all) revmat=(all) shape=(all) pinvar=(all) aamodel=(all);
    prset applyto=(all) ratepr=variable aamodelpr=mixed topologypr=uniform;
    report siterates=yes;
    startvals tau=PhyMLTree V=PhyMLTree;
    mcmcp filename=Pseudomonas_MrBayes ngen=100000 nruns=2 nchains=4 relburnin=yes burninfrac=0.25 mcmcdiagn=yes;
    mcmcp stoprule=yes stopval=0.01 savetrees=yes Checkpoint=yes Checkfreq=5000 Printfreq=10;
    mcmcp nperts=6;
    mcmc;
    sumt contype=Halfcompat ntrees=1 conformat=figtree Showtreeprobs=No;
    sump;
    quit;
end;

Comments

  • lset nucmodel=protein is not applicable to amino acid data.

nylander avatar Jun 20 '19 13:06 nylander

Closing this and referring to issue #112

nylander avatar Jun 20 '19 13:06 nylander

Dear Johan, Thanks for the immediate response. I followed the instructions (both for the tree in the nexus file and the MrBayes command) but I think I still have a problem. Now, I get a normal log likelihood but it seems that the analysis stagnates after a few generations. My output now is:

Initial log likelihoods and log prior probs for run 1:
      Chain 1 -- -14244.398223 -- 67.701678

   There are 11 more chains on other processor(s)


   Using a relative burnin of 25.0 % for diagnostics

   Chain results (90000000 generations requested):

      0 -- [-14244.398] [...11 remote chains...]
   1000 -- (-10163.815) [...11 remote chains...] -- 1774:58:48
   2000 -- (-9254.135) [...11 remote chains...] -- 2162:27:06
   3000 -- (-9138.519) [...11 remote chains...] -- 2374:55:15
   4000 -- (-9109.190) [...11 remote chains...] -- 2524:53:16
   5000 -- (-9096.552) [...11 remote chains...] -- 2649:51:10

   Average standard deviation of split frequencies: 0.123913

   6000 -- (-9102.142) [...11 remote chains...] -- 2828:58:41

No matter how long I wait, no more generations seem to be calculated. The generation at which the analysis gets stuck is not constant; it changes between runs. I was wondering if this is a problem with my system or a bug in the program. Thanks Panos

pbravakos avatar Jun 21 '19 18:06 pbravakos

Dear Panos, I'm assuming this is the bug. We will have to await developments in issue #112. /Johan

nylander avatar Jul 15 '19 10:07 nylander

Issue #112 is fixed, but there seems to be an additional issue with the run hanging after some generations with this dataset. I will look into this.

ronquist avatar Dec 06 '19 15:12 ronquist