rdkit-blog-fastpages
Demonstrating using chembl_downloader to get SDF path
This is a follow-up to this Twitter discussion:
Really cool post, but I wish it were possible to directly re-run it (there are local file paths to ChEMBL data and no automation for downloading)
— Charles Tapley Hoyt (@cthoyt) December 20, 2021
Solution: I added the generation of the substructure library as an extra function in `chembl_downloader`: https://t.co/MgiChnrxaj
This PR makes a small change to automate the download of the ChEMBL SDF file using the lightweight `chembl_downloader` package. It chooses a file path that is deterministic across systems, which removes the need to hard-code a local path to the ChEMBL SDF file.
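For reviewers who have not used the package, here is a minimal sketch of the idea rather than the exact diff. It assumes `chembl_downloader.download_sdf(version=...)` returns the local path to the gzipped SDF and reuses a previously downloaded copy; the wrapper name `get_chembl_sdf_path` is hypothetical and only for illustration.

```python
from pathlib import Path


def get_chembl_sdf_path(version: str = "29") -> Path:
    """Return the local path to the ChEMBL SDF for the given release.

    Hypothetical helper for illustration; not part of chembl_downloader.
    """
    # Imported here so that defining the helper neither needs the package
    # nor triggers any download.
    import chembl_downloader

    # Assumption: download_sdf fetches the file on first use and returns
    # the path of the cached copy on later calls.
    return Path(chembl_downloader.download_sdf(version=version))
```

Calling `sdf_path = get_chembl_sdf_path()` would then stand in for the hard-coded local path. Note that the first call downloads a large file.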
It would also be possible to replace the whole line `with gzip.open(sdf_path) as gz, Chem.ForwardSDMolSupplier(gz) as suppl:` with `with chembl_downloader.supplier(version="29") as suppl:`, but I think that would be a bit too esoteric.
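To make the comparison concrete, here is a rough side-by-side sketch of the two forms. The function names are hypothetical, and counting molecules is just a stand-in for whatever the notebook does with the supplier. It assumes an RDKit release recent enough for `ForwardSDMolSupplier` to work as a context manager, and that `chembl_downloader.supplier` yields an equivalent supplier.

```python
import gzip


def count_molecules(suppl) -> int:
    """Count the entries RDKit could parse (suppliers yield None on failure)."""
    return sum(1 for mol in suppl if mol is not None)


def count_with_explicit_path(sdf_path) -> int:
    """Explicit form: open the gzipped SDF and build the supplier by hand."""
    from rdkit import Chem  # imported lazily; only needed when called

    with gzip.open(sdf_path) as gz, Chem.ForwardSDMolSupplier(gz) as suppl:
        return count_molecules(suppl)


def count_with_supplier(version: str = "29") -> int:
    """Shorter form: let chembl_downloader locate the file and open it."""
    import chembl_downloader  # imported lazily; only needed when called

    with chembl_downloader.supplier(version=version) as suppl:
        return count_molecules(suppl)
```

The explicit form keeps the file handling visible to readers of the post, while the shorter form hides both the path and the gzip handling, which is the trade-off described above.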
Hi @cthoyt, sorry I'm so slow to reply to this one; I missed the notification and am just now seeing it.
I'd be happy to mention using the chembl_downloader here (and agree that it could be useful to people who don't already have a local copy of the file downloaded), but would prefer to have that pulled out into a separate code block/section.