keras-molecules
LogP calculations for 50k dataset
LogP calculations for the 50k dataset. Same order as original SMILES file.
What's the origin of these? Are these measured logPs based on looking up the SMILES? Or cLogP from rdkit? Something else?
Sorry, I should have specified. The LogP values were generated using ChemAxon's "generatemd" program from this repository's "smiles_50k.h5" file.
What would be really great is if you could alter the smiles_50k.h5 file to include a clogp column with this data indexed to the right rows, instead of including this separately as a txt file. Then it would work with the --property_column parameter on the preprocessing script and the rest of the tooling here.
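For illustration, here is a minimal sketch of how the values could be merged into the file. It assumes that smiles_50k.h5 is a pandas HDF5 store readable under the key "table", and that the text file holds one LogP value per line in the same row order as the SMILES. The helper names, the output path and the text file name are hypothetical and not part of this repository.

```python
from typing import Iterable, List


def parse_logp_lines(lines: Iterable[str]) -> List[float]:
    """Parse one LogP value per line, skipping blank lines."""
    return [float(line) for line in lines if line.strip()]


def add_clogp_column(h5_in: str, logp_txt: str, h5_out: str,
                     key: str = "table", column: str = "clogp") -> None:
    """Write a copy of the HDF5 table with the LogP values as a new column.

    Assumption: the input is a pandas HDF5 store under `key`, and the text
    file lists values in exactly the same order as the table rows.
    """
    # Imported lazily: pandas HDF5 support also requires PyTables.
    import pandas as pd

    frame = pd.read_hdf(h5_in, key)
    with open(logp_txt) as handle:
        values = parse_logp_lines(handle)

    # Refuse to guess an alignment if the row counts disagree.
    if len(values) != len(frame):
        raise ValueError(
            f"{len(values)} LogP values for {len(frame)} rows; cannot align")

    # A plain list is assigned positionally, so row i gets value i.
    frame[column] = values
    frame.to_hdf(h5_out, key=key)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    add_clogp_column("smiles_50k.h5", "logp_50k.txt", "smiles_50k_clogp.h5")
```

With a column added this way, the name clogp could then be passed to --property_column on the preprocessing script. The length check is there because the text file carries no identifiers, so row order is the only link between a value and its SMILES string.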
I am curious whether including data generated with ChemAxon would be a problem, specifically whether there are any licensing issues.