chembl_webresource_client
Extract ChEMBL version associated per compound
To whom it may concern,
is there a way to extract something like a publication/experimental date associated with a ChEMBL compound?
For benchmarking studies, we would like to use time-split validation. It would therefore be of tremendous help for us to get an association between a ChEMBL compound and a ChEMBL version (or a publication/experimental date).
Best regards, Paul
Hi!
Thanks for the chembl_webresource_client - we are using it at @volkamerlab a lot!
My question is related to @czodrowskilab's question here (I think), so I'll add it here.
Is there a field for the date when
- a compound or
- a bioactivity
was added to ChEMBL?
I know of the document_year field, which I can fetch for bioactivities:
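from chembl_webresource_client.new_client import new_client

bioactivities_api = new_client.activity
# Fetch only the activity ID and the document year for one target
bioactivities = bioactivities_api.filter(
    target_chembl_id='CHEMBL941'
).only(
    'activity_id',
    'document_year'
)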
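Once such records are fetched, a rough time split on document_year could look like the sketch below. It assumes each record is a dictionary with the two fields requested above; the function name split_by_year, the cutoff value, and the toy records are made up for illustration, and the document year is only a proxy for the deposition date.

```python
def split_by_year(records, cutoff_year):
    """Split activity records into (train, test, undated) by document_year.

    Records with document_year <= cutoff_year go to train, later ones to
    test, and records without a year are returned separately.
    """
    train, test, undated = [], [], []
    for record in records:
        year = record.get("document_year")
        if year is None:
            undated.append(record)
        elif year <= cutoff_year:
            train.append(record)
        else:
            test.append(record)
    return train, test, undated


# Toy records (hypothetical values), shaped like the dictionaries of the query
toy_records = [
    {"activity_id": 1, "document_year": 2008},
    {"activity_id": 2, "document_year": 2015},
    {"activity_id": 3, "document_year": None},
]
train, test, undated = split_by_year(toy_records, cutoff_year=2010)
```

Keeping undated records in their own list avoids silently assigning them to either side of the split.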
Given the ChEMBL database schema, I guess it works as follows: from doc_id (field) in activities (table), go to doc_id (field) in docs (table) and extract year (field).
This probably also works similarly for compounds.
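As a minimal sketch of that lookup, the join can be tried on a toy in-memory SQLite database. Only the table and field names mentioned above (activities, docs, doc_id, year) are taken from the schema; the activity_id column is borrowed from the query result above, all rows are invented, and a real ChEMBL dump contains many more columns.

```python
import sqlite3

# Toy stand-in for a local ChEMBL database (invented rows)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE docs (doc_id INTEGER PRIMARY KEY, year INTEGER);
    CREATE TABLE activities (activity_id INTEGER PRIMARY KEY, doc_id INTEGER);
    INSERT INTO docs VALUES (10, 2005), (11, 2012);
    INSERT INTO activities VALUES (1, 10), (2, 11), (3, 11);
""")

# Follow doc_id from activities to docs and read the document year
rows = conn.execute("""
    SELECT a.activity_id, d.year
    FROM activities AS a
    JOIN docs AS d ON d.doc_id = a.doc_id
    ORDER BY a.activity_id
""").fetchall()
conn.close()
```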
However, the document's year
- does not necessarily equal the bioactivity or compound deposition date in ChEMBL and
- does not need to be the same for compounds and bioactivities, right?
Thus, back to my question: What is the best way to filter compound or bioactivity entries in ChEMBL by their deposition date?
Thank you for your time.
I am realizing: the deposition date probably corresponds to the ChEMBL version in which the compound/bioactivity was added.
Is there a way to access the version? In the ChEMBL schema, the version table does not seem to be connected to any other table: https://www.ebi.ac.uk/chembl/db_schema
Update on this matter: The ChEMBL support team (https://www.ebi.ac.uk/support/) let me know that it is currently not possible to extract data from the web services (or the interface) for a previous ChEMBL release.
Hi, sorry for not having replied to this sooner. As Dominique says, this is currently not possible, but we are working towards supporting it in future ChEMBL versions.
Dear ChEMBL team,
Thank you for chembl_webresource_client. Is there any update on whether the version could be directly/indirectly linked to an activity record?
Like @czodrowskilab, I am also interested in splitting the activity data for a target by the ChEMBL version in which it first appeared (to mimic a time split).
Thanks, Vishal
We are working on implementing time stamps for deposited data sets. We are currently exploring how to deal with updates to documents. Tentatively, we'll release this with ChEMBL 34.
However, time stamps for data sets are already available (and always have been) via the VERSION table, which includes a CREATION_DATE for every ChEMBL release. By querying for the release in which a document was added, you'll get your time stamps. After we've solved the question about updated documents, we plan to make that information more easily accessible, likely via a new table.
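One way this lookup might be sketched, assuming you have already collected the set of document IDs present in each release (for example from local database dumps) together with each release's CREATION_DATE. The function name first_release_dates, the release labels, the document IDs, and the dates below are all placeholders, not real ChEMBL values.

```python
from datetime import date


def first_release_dates(docs_by_release, creation_dates):
    """Map each document ID to the creation date of the earliest release containing it."""
    first_seen = {}
    # Visit releases from oldest to newest so the first hit is the earliest
    for release in sorted(docs_by_release, key=lambda r: creation_dates[r]):
        for doc_id in docs_by_release[release]:
            first_seen.setdefault(doc_id, creation_dates[release])
    return first_seen


# Placeholder inputs (hypothetical releases, documents, and dates)
creation_dates = {"release_A": date(2000, 1, 1), "release_B": date(2001, 1, 1)}
docs_by_release = {"release_B": {"doc_1", "doc_2"}, "release_A": {"doc_1"}}
timestamps = first_release_dates(docs_by_release, creation_dates)
```

The resulting dates can then be attached to activities through their documents and used as the split criterion. This sketch ignores documents that are updated in later releases, which is the open question mentioned above.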