RMG-Py
Exclude surf training
Motivation
In some cases, particularly for surface systems, it is advantageous to exclude the training data for a given family and rely only on the more general rules. This is in part due to the heterogeneity of data obtained from different sources, which can throw off estimates for systems that differ from the training data (e.g. a different metal or facet). It is also useful for performing correlated uncertainty analysis for catalytic systems, such as the study performed by Krietz et al.
Description of Changes
This change allows the "kineticsFamilies" input in an RMG input file to accept tuple arguments similar to the ones used for reactionLibraries:
kineticsFamilies = [
    'Surface_Adsorption_Single',
    ('Surface_Dissociation', True),  # True to exclude training data
]
This makes selectively including or excluding training data from specific families much easier. It can also be done for groups of families:
kineticsFamilies = [
    ('surface', True),
    'default',
]
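As a rough illustration of how such mixed entries could be interpreted, the sketch below normalizes a list of plain strings and (label, flag) tuples into uniform pairs. The helper name normalize_family_entries is hypothetical and is not taken from this PR or from RMG-Py; it also assumes that a bare string means "keep training data".

```python
def normalize_family_entries(entries):
    """Return a list of (label, exclude_training) pairs.

    Hypothetical helper for illustration only. A bare string is assumed
    to mean that training data is kept (exclude_training=False).
    """
    normalized = []
    for entry in entries:
        if isinstance(entry, str):
            # Plain label: keep the training data for this family or group.
            normalized.append((entry, False))
        elif (
            isinstance(entry, tuple)
            and len(entry) == 2
            and isinstance(entry[0], str)
            and isinstance(entry[1], bool)
        ):
            # (label, flag) tuple: the flag says whether to exclude training data.
            normalized.append(entry)
        else:
            raise ValueError(f"Unrecognized kineticsFamilies entry: {entry!r}")
    return normalized


pairs = normalize_family_entries([
    'Surface_Adsorption_Single',
    ('Surface_Dissociation', True),
])
```

Normalizing up front keeps the rest of the loading logic from having to branch on the entry type.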
Functionally, if training data is completely excluded, exact matches with training data will be ignored, and the kinetics family tree is filled in by averaging. In effect, this PR does what adding "!Training" to kineticsDepositories does, but on a per-family basis.
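The behaviour described above can be pictured with a toy model. This is only a sketch with made-up node names and numbers, not RMG-Py's actual tree-filling code; in particular, it uses a plain arithmetic mean as a stand-in for the real averaging scheme.

```python
from statistics import mean

# Hypothetical toy tree: each parent node lists its child nodes.
CHILDREN = {"parent": ["child_1", "child_2"]}
# Hypothetical rate rules (arbitrary units) attached to the child nodes.
RATE_RULES = {"child_1": 2.0, "child_2": 4.0}
# Hypothetical training entry that exactly matches the "parent" node.
TRAINING = {"parent": 100.0}


def estimate(node, exclude_training):
    """Toy estimate for a node, with or without training data."""
    # An exact training match is used only when training data is allowed.
    if not exclude_training and node in TRAINING:
        return TRAINING[node]
    # Otherwise use the node's own rate rule, if it has one.
    if node in RATE_RULES:
        return RATE_RULES[node]
    # Otherwise fill the node in by averaging over its children.
    children = CHILDREN.get(node, [])
    if not children:
        raise ValueError(f"No rule or children available for node {node!r}")
    return mean(estimate(child, exclude_training) for child in children)


with_training = estimate("parent", exclude_training=False)
without_training = estimate("parent", exclude_training=True)
```

With training data allowed, the exact match for "parent" is returned; with it excluded, the same node is estimated from the average of its children's rules.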
Testing
I have added to the unit tests to ensure that the existing methods for adding data are still functional. The set methods implemented by Max are a little slower as a result of introducing a dictionary, so the tests take a little longer to run.
I have also run several mechanisms with this change to ensure that the families for which I exclude training data, and only those families, rely exclusively on the rate rules specified in rules.py. I have attached the input file and results of one such test here.
Reviewer Tips
I would like initial feedback on:
- how to make it more pythonic
- how to make it faster, if possible or necessary (it does not slow down execution very much)
- what other unit tests I might have missed.