RaxML fails -- Empirical base frequency for state number 2 is equal to zero in DNA data partition No Name Provided

Open ramadatta opened this issue 4 years ago • 3 comments

Hi,

I have some identical sequences in my input FASTA, but I would still like to keep them in my analysis and run Gubbins. RAxML fails with the error below. How can I overcome this issue? Thanks.


IMPORTANT WARNING
Found 13 sequences that are exactly identical to other sequences in the alignment.
Normally they should be excluded from the analysis.

An alignment file with sequence duplicates removed has already
been printed to file PreGubbins_withRef.fasta.phylip.reduced

Using BFGS method to optimize GTR rate parameters, to disable this specify "--no-bfgs" 

Alignment has 2 distinct alignment patterns

Proportion of gaps and completely undetermined characters in this alignment: 0.00%

RAxML rapid hill-climbing mode

Using 1 distinct models/data partitions with joint branch length optimization


Executing 1 inferences on the original alignment using 1 distinct randomized MP trees

All free model parameters will be estimated by RAxML
ML estimate of 25 per site rate categories

Likelihood of final tree will be evaluated and optimized under GAMMA

GAMMA Model parameters will be estimated up to an accuracy of 0.1000000000 Log Likelihood units

Partition: 0
Alignment Patterns: 2
Name: No Name Provided
DataType: DNA
Substitution Matrix: GTR

RAxML was called as follows:

raxmlHPC-SSE3 -f d -p 1 -m GTRCAT -V -s PreGubbins_withRef.fasta.phylip -n PreGubbins_withRef.iteration_1 

Partition No Name Provided number 0 has a problem, the number of expected states is 4 the number of states that are present is 3.
Please go and fix your data!

Empirical base frequency for state number 2 is equal to zero in DNA data partition No Name Provided
Since this is probably not what you want to do, RAxML will soon exit.

Failed while building the tree.
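
The log reports only 2 distinct alignment patterns and says 3 of the 4 expected DNA states are present. A quick way to check this is to count the bases in the alignment. The sketch below is a minimal, hedged example and not part of Gubbins or RAxML: the helper names (`read_fasta`, `variable_columns`, `missing_bases`) are hypothetical, and it assumes that the alignment passed to RAxML contains only the variable columns, which is why it looks at those columns by default.

```python
from collections import Counter


def read_fasta(text):
    """Parse FASTA-formatted text into a {name: sequence} dict (upper-cased)."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def variable_columns(records):
    """Return indices of alignment columns where the sequences differ."""
    seqs = list(records.values())
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences are not aligned to the same length")
    return [i for i in range(length) if len({s[i] for s in seqs}) > 1]


def missing_bases(records, variable_only=True):
    """List the bases among A, C, G, T that never occur.

    With variable_only=True, only columns that differ between sequences
    are counted (an assumption about what the tree builder receives).
    """
    seqs = list(records.values())
    if not seqs:
        return list("ACGT")
    cols = variable_columns(records) if variable_only else range(len(seqs[0]))
    counts = Counter({base: 0 for base in "ACGT"})
    for i in cols:
        for s in seqs:
            if s[i] in counts:
                counts[s[i]] += 1
    return [base for base in "ACGT" if counts[base] == 0]


if __name__ == "__main__":
    # Toy alignment for illustration only; replace with your own file contents.
    demo = ">s1\nACAT\n>s2\nACAT\n>s3\nACCT\n"
    print(missing_bases(read_fasta(demo)))
```

If this returns a non-empty list for your data, the empirical frequency of at least one base is zero, which matches the RAxML message above.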

ramadatta avatar Aug 30 '19 05:08 ramadatta

I have the same error. Did you find a solution?

yuanw-18 avatar Feb 23 '22 12:02 yuanw-18

@yuanw-18 I can't remember whether this is possible in Gubbins versions before 3, but you can try the Jukes-Cantor (JC) or K2P model, both of which assume equal base frequencies and so avoid having to estimate empirical ones. This can certainly be done in the newer versions (3 and above). Alternatively, you can switch away from RAxML to see whether other tree-building software works.
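
As a rough sketch of that suggestion, the snippet below assembles a `run_gubbins.py` command line with a different substitution model and, optionally, a different tree builder. It assumes the `--model` and `--tree-builder` options as they appear in the Gubbins 3 help text; please confirm the exact option names and accepted values with `run_gubbins.py --help` for your installed version. The function name `gubbins_command` and the file name `alignment.fasta` are placeholders.

```python
import shlex


def gubbins_command(alignment, model="JC", tree_builder=None):
    """Build an argument list for run_gubbins.py.

    Option names are assumed from the Gubbins 3 help output; verify them
    against your installation before running the command.
    """
    cmd = ["run_gubbins.py", "--model", model]
    if tree_builder:
        cmd += ["--tree-builder", tree_builder]
    cmd.append(alignment)
    return cmd


if __name__ == "__main__":
    # Print the command rather than running it, so it can be inspected first.
    print(shlex.join(gubbins_command("alignment.fasta")))
    print(shlex.join(gubbins_command("alignment.fasta", model="K2P",
                                     tree_builder="iqtree")))
```

Not every model is available for every tree builder, so a given combination may still be rejected.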

nickjcroucher avatar Feb 23 '22 12:02 nickjcroucher

Thank you. I will try that.

yuanw-18 avatar Feb 23 '22 12:02 yuanw-18