
Fix training degeneracy

Open goldmanm opened this issue 7 years ago • 12 comments

This PR:

  1. creates a script to find 'incorrect' degeneracy values among training reactions.
  2. fixes the 'incorrect' degeneracy values among all training reactions.

In this case 'incorrect' degeneracy means the value in the database is different from what RMG would predict (leading to an incorrect rate rule).
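The check described above can be sketched roughly as follows. This is a minimal illustration, not the script from this PR: `TrainingEntry`, `find_mismatches`, and the `predict` callback are hypothetical names, and the predictor is a stub standing in for whatever RMG computes. The two sample values are the ones discussed later in this thread.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class TrainingEntry:
    """Hypothetical stand-in for one training reaction entry."""
    label: str
    degeneracy: float


def find_mismatches(
    entries: List[TrainingEntry],
    predict: Callable[[str], float],
) -> List[Tuple[str, float, float]]:
    """Return (label, listed, predicted) for entries whose listed
    degeneracy differs from the predicted one."""
    mismatches = []
    for entry in entries:
        predicted = predict(entry.label)
        if abs(predicted - entry.degeneracy) > 1e-8:
            mismatches.append((entry.label, entry.degeneracy, predicted))
    return mismatches


# Stub predictor (assumption): a lookup table in place of RMG's calculation.
predicted_values = {
    "C2H + CH3O <=> C2H2 + CH2O": 1.0,
    "NH + NO2_r <=> HNNO2": 2.0,
}
entries = [
    TrainingEntry("C2H + CH3O <=> C2H2 + CH2O", 3.0),
    TrainingEntry("NH + NO2_r <=> HNNO2", 1.0),
]
mismatches = find_mismatches(entries, predicted_values.__getitem__)
```

Each mismatch keeps both the listed and the predicted value, so a reviewer can judge whether the database entry or the prediction is at fault.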

Currently, the script makes some incorrect assessments without the bugfix in calculateDegeneracy RMG PR 1276.

For someone to review this PR, I think running the script and reading through the script code to see if it makes sense should suffice (along with other recommendations).

If you think the script should also be turned into a databaseTest, that can also be done.

goldmanm avatar Feb 16 '18 16:02 goldmanm

Are we reasonably sure that the rates of all training reactions are the overall rates (rather than per reaction site)? Is there any way to be sure? Just interested in your thoughts.

mliu49 avatar Feb 16 '18 16:02 mliu49

I am not sure that all the rates in the training reaction depositories are overall rates (as opposed to per reaction site). I did check a few of the rates when doing isotope work, which led to an error. Since the training reactions are more recent, they are more likely to be standardized. I am quite confident that some rate rules input by Harper were overall instead of per site.

If people put in the training reaction per site with a degeneracy of 1, then we would still predict the incorrect rate when finding an exact match, so it isn't even a correct way to input them.

Many of the reactions which I changed seemed to be overlooked errors, as opposed to putting 1 for degeneracy and scaling the rate by the actual degeneracy.

goldmanm avatar Feb 16 '18 19:02 goldmanm

I'll go through each change I made here, and give justification:

The current branch has a degeneracy of 3 listed here.

C2H + CH3O <=> C2H2 + CH2O

The person probably thought CH3O has a radical on the oxygen. According to the species dictionary, the SMILES would be [CH2]O. There is only one hydrogen on the oxygen, so the degeneracy should be one.

As another example, the degeneracy previously listed here was 1:

"NH + NO2_r <=> HNNO2"

For this Birad R-recombination, 1 makes intuitive sense. However, there are two resonance structures of NO2_r which make this reaction have a degeneracy of 2.

The same mistake occurred with "S1C4 <=> S1C4b" and "NO2 + NO2 <=> N2O4" (including forgetting to decrease RPD of same reactants).

The same-reactant mistake also occurred in "CH3 + CH3 <=> C2H6" and "C2H5 + C2H5 <=> C4H10".

I checked that the rates in training reactions matched kinetics libraries for "C6H6 <=> C6H6-2", "C6H6-3 <=> C6H6-4", "C6H9 <=> C6H9-2", "C7H11 <=> C7H11-2", "C7H9 <=> C7H9-2", "C7H9-3 <=> C7H9-4", "C6H6 <=> C6H6-2", "C6H6 <=> C6H6-2", "C6H6-3 <=> C6H6-4", "C7H9 <=> C7H9-2", "C7H9-3 <=> C7H9-4", "C6H7 <=> C6H7-2", "C10H11-3 <=> C10H11-4", "C6H7-7 <=> C6H7-8", ... (I skipped a few of the vinylCPD_H and C3 libraries if they occurred in the same batch).

goldmanm avatar Feb 16 '18 19:02 goldmanm

To summarize, I am reasonably confident that all the reactions that had degeneracy changed were input on a per molecule basis.

I can't say anything about all training reactions.

goldmanm avatar Feb 16 '18 19:02 goldmanm

I still think that the degeneracy for NH + NO2_r <=> HNNO2 should be 1 (this is an interesting case now that I think about it, thanks for bringing it up).

Generating resonance structures for NO2 keeping isomorphic structures gives: (four resonance structure images). The reaction here deals with the top two, where the radical site is on the N.

Perhaps I'm wrong, but since there's only one N radical site to react, and the O groups on each end are "identical", the degeneracy should be 1.

alongd avatar Feb 16 '18 19:02 alongd

@alongd, Nice observation! I completely agree with you that the degeneracy of this reaction which makes most sense is 1.

To keep RMG accurate, the degeneracy value listed in the training reaction should be the same as the degeneracy value that RMG would output for the reaction.

Since RMG outputs a degeneracy of 2 for the reaction, not listing the degeneracy as 2 here will overestimate a reaction like CH2(triplet) + NO2 -> HCNO2 by a factor of two.
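The factor of two can be seen with a small arithmetic sketch. It assumes a simplified picture in which a per-site rate rule is formed by dividing the training rate by the listed degeneracy, and an estimate is formed by multiplying that rule by the degeneracy RMG computes for the new reaction. The function names and the rate value are hypothetical, and this is not RMG's actual code.

```python
def rate_rule_per_site(k_training: float, listed_degeneracy: float) -> float:
    """Per-site rule derived from an overall training rate (assumed model)."""
    return k_training / listed_degeneracy


def estimate_rate(rule: float, computed_degeneracy: float) -> float:
    """Overall rate estimated for an analogous reaction (assumed model)."""
    return rule * computed_degeneracy


k_training = 1.0          # overall training rate, arbitrary units (made up)
computed_degeneracy = 2   # what RMG outputs for this kind of reaction

# Training reaction listed with degeneracy 1: the rule is too large.
overestimate = estimate_rate(rate_rule_per_site(k_training, 1), computed_degeneracy)

# Training reaction listed with degeneracy 2: listed and computed values cancel.
consistent = estimate_rate(rate_rule_per_site(k_training, 2), computed_degeneracy)

ratio = overestimate / consistent
```

Under these assumptions the estimate is only consistent when the listed degeneracy matches the one RMG computes, regardless of which value is chemically more sensible.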

It's possible to further refine the degeneracy algorithm, and when the degeneracy calculation is updated, the database should be changed to reflect this. I am hoping this pull request makes the entire process of updating degeneracy values in the database easier.

goldmanm avatar Feb 16 '18 20:02 goldmanm

I just updated to current master and reran fixTrainingReactionDegenerarcies, and all the values in the database have the same degeneracy as listed by RMG.

goldmanm avatar Apr 24 '18 15:04 goldmanm

@alongd, once this PR gets merged in, I can create an RMG-Py PR to add the database test.

goldmanm avatar May 31 '18 14:05 goldmanm

I have rebased this branch and found additional corrections in degeneracy (from training reactions added since this PR was made). I will implement these fixes, and then this PR should be ready for merging, provided other people agree.

goldmanm avatar May 31 '18 15:05 goldmanm

OK, overall looks very good, thanks @goldmanm

alongd avatar May 31 '18 15:05 alongd

Just fixed all the degeneracy errors since this pull request was first introduced.

goldmanm avatar May 31 '18 15:05 goldmanm

@goldmanm, are we waiting to hear back regarding the unchanged CH3+CH3 rate in one of the tests as a sample case before merging this in?

alongd avatar May 31 '18 23:05 alongd

PR abandoned - @jonwzheng and I will consider adding robust degeneracy checks during the rewrite, as well as check the current tests to see if this was rolled in elsewhere.

JacksonBurns avatar Aug 14 '24 17:08 JacksonBurns