2023 CCLE study (DepMap) release
What?
Adding the CCLE 2023 study release
checks
For all pull requests:
- [ ] Passes validation
For a new study (in addition to above):
- [ ] Does study name and study ID follow our convention? e.g. Tumor_Type (Institute, Journal Year); brca_mskcc_2015
- [ ] Is study metadata complete? e.g. pmid, group of PUBLIC
- [ ] Were all samples profiled with WES/WGS? If not, is the gene panel file curated?
- [ ] Are OncoTree codes of all samples curated? Cancer Type and Cancer Type Detailed need to be added in addition to Oncotree Code.
- [ ] clinical sample and patient data with meta files
- [ ] mutations data with meta files
- [ ] MAF is based on hg19
- [ ] MAF with 2 isoforms: uniprot and mskcc
- [ ] CNA data with meta files
- [ ] CNA segment data with meta files
- [ ] Expression data including z-scores with meta files
- [ ] Case-lists for all profiles.
- [ ] Manual checking (Niki or JJ): Triage or private Portal link here
- [x] Meta study needs to be redone to follow our convention.
- [x] DepMap needs to be added as a source
- [x] Can we distinguish between WGS and WES samples in the cohort?
- [x] "" marks in the clinical file
- [x] We probably need to identify which attributes are important and then change their priority accordingly.
- [x] Are we pulling the right mutation data? For https://triage.cbioportal.mskcc.org/patient?studyId=ccle_broad_2023&sampleId=ACH-000001, nothing matches what is in the DepMap portal. Mutation data was retrieved from the "OmicsSomaticMutationsMAFProfile.maf" file. All files from the mutations pipeline that contain the somatic point mutations and indels called in the DepMap cell lines contain the same variants. Followed up with the authors; the author from the genomic team is out for the rest of the year.
- [x] For SV, can we pull chr numbers and coordinates? Chromosome number and position are provided.
- [x] Is CNA segment data correct?
Yes, I only mapped the profile ID provided in the DepMap downloaded file ‘OmicsCNSegmentsProfile’ to the sample ID to avoid confusion.
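
For reviewers, the profile-ID-to-sample-ID mapping described above could look roughly like the sketch below. It is not the script used for this PR. The input column names (`ProfileID`, `Chromosome`, `Start`, `End`, `NumProbes`, `SegmentMean`), the `PR-...` profile IDs, and the helper names `remap_segments` and `to_seg_text` are assumptions for illustration; only the output header follows the cBioPortal segment file layout. Whether the segment values need any transformation is not covered here.

```python
import csv
import io

# Column order of a cBioPortal segment (.seg) data file.
SEG_COLUMNS = ["ID", "chrom", "loc.start", "loc.end", "num.mark", "seg.mean"]


def remap_segments(segment_rows, profile_to_sample):
    """Swap each row's profile ID for its sample ID.

    Returns (mapped_rows, unmapped_profile_ids). Rows whose profile ID has
    no known sample ID are skipped and reported instead of being guessed.
    """
    mapped, unmapped = [], set()
    for row in segment_rows:
        profile_id = row["ProfileID"]  # assumed input column name
        sample_id = profile_to_sample.get(profile_id)
        if sample_id is None:
            unmapped.add(profile_id)
            continue
        mapped.append({
            "ID": sample_id,
            "chrom": row["Chromosome"],    # assumed input column name
            "loc.start": row["Start"],     # assumed input column name
            "loc.end": row["End"],         # assumed input column name
            "num.mark": row["NumProbes"],  # assumed input column name
            "seg.mean": row["SegmentMean"],  # assumed input column name
        })
    return mapped, sorted(unmapped)


def to_seg_text(mapped_rows):
    """Serialize mapped rows as tab-separated text with the seg header."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=SEG_COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(mapped_rows)
    return out.getvalue()
```

Reporting unmapped profile IDs separately, rather than dropping them silently, makes it easier to confirm during manual checking that every segment row ended up under the intended sample.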