qca-dataset-submission

Potential dataset: Enamine REAL

Open jchodera opened this issue 6 years ago • 2 comments

There is enormous interest in gigadocking and free energy calculations with Enamine REAL, which is a large virtual purchasable library of up to 11 billion molecules at current count.

The "building blocks" subset contains only ~73K molecules, which might provide good coverage of much of the chemistry in the database.

Other downloadable Enamine REAL subsets can be found on this page.

jchodera, Jul 02 '19 23:07

On closer inspection, it may be better not to use the "building blocks" subset, and instead to fragment larger purchasable compound sets and eliminate duplicate fragments.
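
A rough sketch of the fragment-then-deduplicate idea, not a worked-out protocol. The names `toy_fragment` and `unique_fragments` are hypothetical, and the fragmenter is only a placeholder that splits a SMILES string into its dot-separated components. A real run would presumably swap in a proper fragmentation scheme and canonical SMILES from a cheminformatics toolkit, since plain string comparison only catches duplicates that are written identically.

```python
from collections import Counter
from typing import Callable, Iterable


def toy_fragment(smiles: str) -> list[str]:
    """Placeholder fragmenter (assumption): split on '.' into disconnected components.

    A real fragmenter would cut bonds and return canonicalized fragments.
    """
    return [part for part in smiles.split(".") if part]


def unique_fragments(
    smiles_list: Iterable[str],
    fragment: Callable[[str], Iterable[str]] = toy_fragment,
) -> Counter:
    """Collect distinct fragments, counting each one once per parent molecule."""
    counts: Counter = Counter()
    for smiles in smiles_list:
        # set() drops fragments repeated within a single parent molecule
        counts.update(set(fragment(smiles)))
    return counts


if __name__ == "__main__":
    # Made-up input strings for illustration only, not taken from Enamine REAL
    library = ["CCO.c1ccccc1", "c1ccccc1.CC(=O)O", "CCO"]
    for frag, n_parents in unique_fragments(library).most_common():
        print(frag, n_parents)
```

The keys of the returned counter are the deduplicated fragment set; the counts show how many parent molecules contain each fragment, which could help prioritize fragments that cover the most chemistry.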

jchodera, Jul 03 '19 16:07

If I were picking compound subsets, I'd tackle the following in order:

jchodera, Sep 05 '20 18:09