
Add GDSC Dataset

Open anumala2 opened this issue 1 month ago • 0 comments

Purpose

This pull request adds the drug information dataset from the Genomics of Drug Sensitivity in Cancer (GDSC) project to the PyHealth library. The GDSC drug_info table is a drug-centric metadata table describing the compounds screened in the GDSC cell-line drug-sensitivity project. Typical columns include unique drug identifiers, canonical names, alternate names or synonyms, molecular or protein targets, the higher-level pathways targeted, external chemical identifiers (e.g., PubChem CID), and bookkeeping counts such as sample sizes or number of experiments. The broader GDSC resource pairs this drug metadata with measured drug response (e.g., IC50) across hundreds to thousands of cancer cell lines, enabling pharmacogenomic analyses. The data is sourced from the extending-cadre repository.
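
To make the table shape described above concrete, here is a minimal sketch of parsing a drug_info-style CSV into per-drug records using only the Python standard library. It is not the PyHealth dataset class and does not use any PyHealth API. The column names (`drug_id`, `drug_name`, `synonyms`, `targets`, `target_pathway`, `pubchem_cid`, `sample_size`), the `DrugInfo` record, the `load_drug_info` helper, and the two sample rows are all hypothetical placeholders; the actual file in the extending-cadre repository may use different headers and delimiters.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical sample with placeholder values; not real GDSC data.
SAMPLE_CSV = """\
drug_id,drug_name,synonyms,targets,target_pathway,pubchem_cid,sample_size
1,DrugA,"DrugA-1, DrugA-2","TargetX, TargetY",PathwayP,12345,300
2,DrugB,,TargetZ,PathwayQ,,250
"""


@dataclass
class DrugInfo:
    """One row of a drug_info-style table (field names are assumptions)."""
    drug_id: int
    name: str
    synonyms: List[str] = field(default_factory=list)
    targets: List[str] = field(default_factory=list)
    pathway: str = ""
    pubchem_cid: Optional[str] = None
    sample_size: int = 0


def _split_list(cell: str) -> List[str]:
    """Split a comma-separated cell into trimmed, non-empty items."""
    return [part.strip() for part in cell.split(",") if part.strip()]


def load_drug_info(handle) -> Dict[int, DrugInfo]:
    """Read drug metadata rows from a file-like object, keyed by drug_id."""
    drugs: Dict[int, DrugInfo] = {}
    for row in csv.DictReader(handle):
        drug = DrugInfo(
            drug_id=int(row["drug_id"]),
            name=row["drug_name"].strip(),
            synonyms=_split_list(row["synonyms"]),
            targets=_split_list(row["targets"]),
            pathway=row["target_pathway"].strip(),
            # An empty external identifier is stored as None.
            pubchem_cid=row["pubchem_cid"].strip() or None,
            sample_size=int(row["sample_size"]),
        )
        drugs[drug.drug_id] = drug
    return drugs


drugs = load_drug_info(io.StringIO(SAMPLE_CSV))
```

Keying the records by drug identifier reflects the drug-centric layout of the table, and would make it straightforward to join against per-cell-line response measurements (e.g., IC50) that reference the same identifier.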

anumala2 • Dec 08 '25 04:12