pymatgen Bond Valence equation error

Problem

In the pymatgen/pymatgen/analysis/bond_valence.py module in the two functions calculate_bv_sum and calculate_bv_sum_unordered there is an error in the value of b parameter of the BV equation. In the two functions the BV for a given bond is calculated as vij = exp((R - nn.nn_distance * scale_factor) / 0.31) with b parameter written as 0.31, though the original paper reads "Here b is commonly taken to be a “universal” constant equal to 0.37 Á; we use this equation with this value of b throughout."

Proposed Solution

Just substitute 0.31 with 0.37 in the equation

Alternatives

No response

Nov 19 '23 16:11 trioxane

The bond valence values are coupled to statistics related to the oxidation states. Even if the value might not be correct, we would need to keep it consistent with the model for the oxidation states. Thus, a potential solution would be much more complex.

Nov 19 '23 16:11 JaGeo

In any case, thank you for pointing this out.

Nov 19 '23 16:11 JaGeo

@shyuep Do you maybe have more insight into the development of this method and potential reasons for the difference between paper and implementation? (The scale factor alone does not make up for the deviation but I haven't checked all of the ionic radii and potential scaling there)

Nov 20 '23 07:11 JaGeo

To be honest, this is so long ago that I don't really remember. If I don't recall wrongly, the actual implementation was based on the work of Stefan Adams (on using BV to calculate diffusion pathways) and my hazy memory suggests that one of Stefan's work provided the value of 0.37 and I confirmed it verbally with him. But feel free to check his work. It is also possible this was an implementation error. But I don't really think the values make that huge a difference in qualitative analyses, and BV is ultimately a highly empirical method.

Nov 21 '23 18:11 shyuep

We could add a note that this discrepancy of 0.31 and 0.37 exists or make it adjustable, but as the oxidation state estimator depends on this method, we cannot really change the default without causing a major mess. 0.31 that is implemented here, is the constant suggested by Pauling but not the one that shows up in later work (0.37).

@shyuep opinions?

Nov 22 '23 07:11 JaGeo

However, I am also still not getting the bond valance parameters in the file. According to the paper, there should also be an oxidation state dependency. Maybe, it is indeed an adapted model. I don't really have time to investigate in more detail.

Nov 22 '23 07:11 JaGeo

I could get a couple of words in edgewise. As far as I understand the 0.37<>0.31 substitution is just a mistake because of the explanation in the text written not enough clearly and coincidental mentioning of the very old work of Pauling where he used b=0.31 in the next sentence. When skimming through the text, one can easily perceive 0.31 as the b parameter of the equation used in the cited work.

The dependence on the oxidation state indeed exists, but it is really small and only for some bonds it is significant, like those mentioned in the article: "Omitted were data for the following bonds: O-F, O-O, F-F, Cu-O, Cu-F Ag-O, and Ag-F. R for the first three are much longer (by about 0.2 A) than predicted; R for the last four are known to depend strongly on the oxidation state of the metal and they are also significantly longer than predicted".

The whole method presented in the paper is devoted to finding R BV parameter (in the form R =ri + rj - f(ci,cj,ri,rj)) for bonds to electronegative elements. The b parameter though is fixed at 0.37, since many empirical R values for bonds are derived with the same b=0.37 BV parameter. Quote from the article: "We used 600 values of R determined from crystal and molecular structures for bonds to as many as 16 different “electronegative" elements and found the best values of c and r that minimized the squared deviation of the calculated and observed values of R for 75 elements." Again, no oxidation state dependence is envisaged, which makes this approach universal.

Nov 22 '23 08:11 trioxane

Thanks for clarifying! (Sorry for just skimming through the paper)

We still have the issue that the subsequent oxidation state model is based on old parameters.

Nov 22 '23 09:11 JaGeo

@trioxane Are the parameters listed here agreeing with the publication from 1991?

pymatgen/analysis/bvparam_1991.yaml

Nov 22 '23 09:11 JaGeo

Yes, these are the r and c parameters from the Table 1 in the cited article used for calculating R values for different bonds to electronegative elements.

Nov 22 '23 13:11 trioxane

I also went back to the implementation and especially the documentation of the oxidation state predictor.

As far as I understand, the oxidation states are estimated by a Naive Bayes estimator. The p(bv/ox_state) is estimated by a 1d normal distribution and p(ox_state) is simply the relative frequency of the ox_state. Then, one uses the bond valence sum to get the probabilities. I assume that the gaussian normal distribution for each oxstate has been computed by getting the mean bond valence sum and also the standard deviation (but this isn't exactly clear). Thus, a naive Bayes classifier.

The documentation calls it a "maximum a posteriori estimation method". I would like to replace this with "Naive Bayes classifier". https://github.com/materialsproject/pymatgen/blob/ac8a7e9cd93ba9f7afeaced45dc28af6d4e151d4/pymatgen/analysis/bond_valence.py#L104

The documentation seems to mix up posterior and prior here: https://github.com/materialsproject/pymatgen/blob/ac8a7e9cd93ba9f7afeaced45dc28af6d4e151d4/pymatgen/analysis/bond_valence.py#L113

@shyuep do you maybe remember more details? If you agree, I would at least like to update the documentation and maybe also add a warning about the 0.31 somewhere in the documentation of the code.

Jan 30 '24 17:01 JaGeo

I haven’t caught up with this issue yet, but wanted to raise awareness that there are much newer datafiles of bond valence parameters available from the IUCr:

https://www.iucr.org/resources/data/datasets/bond-valence-parameters

Jan 30 '24 20:01 mkhorton

Similar to the mentioned issue above (wrong b value), this would mean recomputing the classifier again (or using the model with different parameters?). I have also no real feeling if these changes will affect the classifier much.

Jan 30 '24 21:01 JaGeo

I know this is not really an urgent matter, but as this oxidation state predictor is an important part of many subsequent codes, the MP website, it might be good to adapt the documentation. Happy to do it. I just want to know that I didn't misunderstood the method.

Feb 03 '24 12:02 JaGeo

pymatgen pymatgen copied to clipboard

Bond Valence equation error

Problem

Proposed Solution

Alternatives

pymatgen
pymatgen copied to clipboard