sigma
Identification list for species:
Chemical names are nice for humans, but less practical for computers. It would be nice if the cavities were accompanied by a list of CAS numbers (asked for by many journals nowadays too), to uniquely identify each compound.
There is an online tool available, and I have been running the names through it. (This is the one I used: http://cts.fiehnlab.ucdavis.edu/ ) I did not (!) verify every match of name to CAS number, but I suspect that, between no CAS numbers at all and a list with some errors, some errors are the lesser evil. In addition, I determined that using the cavities with the "regular sigma range" is not possible for ions. Radicals as well as some heavier atoms also seem to cause issues.
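Since not every match was verified by hand, a cheap first filter is the CAS check digit: the last digit of a CAS number is the weighted sum of the preceding digits (counted from the right, weights 1, 2, 3, ...) modulo 10. The sketch below is only a sanity check, and the function name `is_valid_cas` is made up for illustration. It catches typos and malformed entries, but it cannot catch a well-formed number that belongs to the wrong compound.

```python
import re

# A CAS number has the shape NNNNNNN-NN-N (2 to 7 digits, 2 digits, 1 check digit).
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas(cas: str) -> bool:
    """Return True if `cas` is well-formed and its check digit is consistent."""
    match = CAS_PATTERN.match(cas.strip())
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    check = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(weight * int(d) for weight, d in enumerate(reversed(digits), start=1))
    return total % 10 == check


print(is_valid_cas("7732-18-5"))  # well-formed, check digit matches
print(is_valid_cas("7732-18-4"))  # last digit altered, check digit fails
```

A number that passes this test can still be assigned to the wrong name, so it complements a manual check rather than replacing it.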
Here is the 'result' of matching names to CAS numbers for the MOPAC cavities: POA1_working_compounds.txt POA1_ions_radicals_CAS_not_found.txt
In the same effort, I just filtered the list for the GAMESS cavities:
Side note:
I have also identified an issue with a CAS number. The tool gave me '13670-17-2' for water, which is heavy water. Normal water is '7732-18-5'.
There is also at least one duplicate in the GAMESS database: tetrachloroethylene and tetrachloroethene.
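Duplicates of this kind (two synonyms for one compound) are easy to find once every name has a CAS number, because the synonyms map to the same number. A minimal sketch, assuming the list is available as (name, CAS) pairs; the function name `find_duplicate_cas` and the small example list are illustrative only and do not reflect the actual file layout:

```python
from collections import defaultdict


def find_duplicate_cas(entries):
    """Group names by CAS number and return only the groups with more than one name."""
    by_cas = defaultdict(list)
    for name, cas in entries:
        by_cas[cas].append(name)
    return {cas: names for cas, names in by_cas.items() if len(names) > 1}


# Illustrative entries, not taken from the database files.
entries = [
    ("tetrachloroethylene", "127-18-4"),
    ("tetrachloroethene", "127-18-4"),
    ("water", "7732-18-5"),
]
duplicates = find_duplicate_cas(entries)
for cas, names in duplicates.items():
    print(cas, names)
```

If conformers are added later, several entries will legitimately share one CAS number, so a check like this would then need to look at the names as well.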
Thanks for the lists. We could try to improve on this in the future. Some points to keep in mind:
- How to handle cations, anions and other possible intermediate radicals without CAS
- How to handle multiple conformers of the same molecule (currently we are providing only one conformer)
Regarding the usual range of -0.025 to 0.025: exceeding it can happen. We are providing the 'raw' apparent surface charges here; after 'averaging' the surface charges, this is less likely to happen.
Well, there is an extension to COSMO-SAC for electrolytes: https://pubs.acs.org/doi/abs/10.1021/ie100689g That isn't of interest to me at present, though, so I cannot comment on it any further. (And one more paper: https://www.sciencedirect.com/science/article/pii/S0378381218300347 )
In the case of the sigma profiles where the range was exceeded, this was for the averaged charge density. Expanding the range takes care of the problem, as would changing the parameterisation. But as mentioned above, I am at present not interested in ions, so it is easier for me to just remove them. (And place them in a dedicated list.) For all stable species, the range of the averaged sigma profile does not exceed the parameterisation range.
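The sorting into a working list and a dedicated list can be sketched as a simple range test on the averaged charge densities. This is only an illustration: the function names, the dictionary layout, and the numbers in the example are made up, and the default limits are the -0.025 to 0.025 range mentioned above.

```python
def exceeds_sigma_range(sigmas, lower=-0.025, upper=0.025):
    """Return True if any averaged charge density lies outside [lower, upper]."""
    return any(s < lower or s > upper for s in sigmas)


def split_by_sigma_range(species, lower=-0.025, upper=0.025):
    """Split a {name: averaged charge densities} mapping into two name lists."""
    within, outside = [], []
    for name, sigmas in species.items():
        if exceeds_sigma_range(sigmas, lower, upper):
            outside.append(name)
        else:
            within.append(name)
    return within, outside


# Made-up values for illustration only, not computed sigma profiles.
species = {
    "neutral_example": [-0.012, 0.000, 0.015],
    "ion_example": [-0.031, -0.020, 0.004],
}
within, outside = split_by_sigma_range(species)
print("within range:", within)
print("outside range:", outside)
```

Passing wider limits via `lower` and `upper` corresponds to the "expand the range" option instead of removing the species.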
As to how to properly handle ions and conformers: I don't know. I do know that other people are also interested in conformers. Whether I will work with them in the future, I do not know. I would possibly suggest a slightly more complex naming pattern: either a "fake CAS number" with appended information, say "-c00001", "-c00002" etc., leaving the treatment to code, or an additional column that provides an integer conformer counter for entries with identical CAS numbers. - I'm sure that different people will have different favoured approaches.
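Both suggestions can be sketched in a few lines. This is one possible reading of the idea, not an agreed format: the five-digit "-cNNNNN" suffix follows the example above, while the function names and the counter starting at 1 are assumptions made for illustration.

```python
import re
from collections import Counter

# Option 1: "fake CAS number" = real CAS number plus a "-cNNNNN" conformer suffix.
CONFORMER_ID = re.compile(r"^(?P<cas>\d{2,7}-\d{2}-\d)-c(?P<index>\d{5})$")


def make_conformer_id(cas: str, index: int) -> str:
    """Append a zero-padded conformer counter to a CAS number."""
    return f"{cas}-c{index:05d}"


def parse_conformer_id(identifier: str):
    """Split an identifier back into (CAS number, conformer counter)."""
    match = CONFORMER_ID.match(identifier)
    if match is None:
        raise ValueError(f"not a conformer identifier: {identifier!r}")
    return match.group("cas"), int(match.group("index"))


# Option 2: keep the CAS number untouched and add a separate counter column.
def number_conformers(cas_numbers):
    """Return (CAS number, counter) rows, counting repeats of the same CAS number."""
    seen = Counter()
    rows = []
    for cas in cas_numbers:
        seen[cas] += 1
        rows.append((cas, seen[cas]))
    return rows


print(make_conformer_id("7732-18-5", 1))
print(parse_conformer_id("7732-18-5-c00002"))
print(number_conformers(["64-17-5", "64-17-5", "7732-18-5"]))
```

The suffix keeps everything in one column but means the identifier is no longer a valid CAS number, while the extra column leaves the CAS number usable as-is by other tools.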