Possible issue with the repository and collaboration
I am very interested in your work, especially the development of chemical databases such as ACCDB.
I studied the database files in the repository.
There may be an issue with several files that I cannot fully understand, for example https://github.com/peverati/ACCDB/blob/master/Databases/MetalsEE/4d-SSIP24/DatasetEval.csv . According to the description of the 4d-SSIP24 set, it should include the subsets SS16 (16 reactions) and IP8 (8 reactions). However, the file also contains an SSIP24 subset (24 reactions) that exactly duplicates the two previous ones, including all values and coefficients. Similar problems appear in many other files.
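The duplication described above could be checked automatically with something like the following minimal sketch. It assumes each row of a dataset file has already been reduced to a pair of subset name and a hashable description of one reaction (for instance a tuple of its values and coefficients). The function name `find_union_duplicates` and the sample rows are hypothetical, and the rows are synthetic placeholders, not data from the repository. Reading `DatasetEval.csv` is left out on purpose, because I do not want to assume its column layout.

```python
from collections import defaultdict


def find_union_duplicates(rows):
    """Return {subset: [smaller subsets]} for every subset whose entries
    are exactly the union of other, strictly smaller subsets.

    rows: iterable of (subset_name, entry) pairs; entry must be hashable.
    """
    by_subset = defaultdict(set)
    for name, entry in rows:
        by_subset[name].add(entry)

    result = {}
    for name, entries in by_subset.items():
        # Candidate parts: non-empty subsets strictly contained in this one.
        parts = sorted(
            other
            for other, other_entries in by_subset.items()
            if other != name and other_entries and other_entries < entries
        )
        covered = set()
        for part in parts:
            covered |= by_subset[part]
        if parts and covered == entries:
            result[name] = parts
    return result


# Synthetic placeholder rows that mimic the situation described above:
# 16 entries tagged SS16, 8 tagged IP8, and the same 24 tagged SSIP24.
ss = [("ss", i) for i in range(16)]
ip = [("ip", i) for i in range(8)]
rows = (
    [("SS16", e) for e in ss]
    + [("IP8", e) for e in ip]
    + [("SSIP24", e) for e in ss + ip]
)

duplicates = find_union_duplicates(rows)
print(duplicates)  # {'SSIP24': ['IP8', 'SS16']}
```

A non-empty result only shows that one subset repeats the entries of others. Whether that is intentional (a combined subset listed for convenience) or an error is still a question for the maintainers.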
Additionally, I compared the number of files in the subsets mentioned in the article https://onlinelibrary.wiley.com/doi/abs/10.1002/jcc.25761 with the numbers given in the README.md file in the repository. There are mismatches in several databases, such as MGCDB84 and GMTKN55. Could you explain the cause of these mismatches? Are the databases in the repository consistent with the published article? Where can I find the latest tested version of the database?
I would be very grateful if you could answer these questions. I am also currently working on a project in this area and would like to collaborate with you on expanding the available chemical databases and, possibly, improving their usability for machine learning problems.