missing file: mutation__gene_x_ccle_cellline.gct from ras_cell_line_predictions.ipynb

Open nvk747 opened this issue 5 years ago • 4 comments

Hi,
I was working through Ras_cell_line_predictions.ipynb and found that the file mutation__gene_x_ccle_cellline.gct is missing from the data. I checked the onco-gps-paper-analysis data folder [https://github.com/UCSD-CCAL/onco_gps_paper_analysis/tree/master/data] but could not find it there, and I also looked through the CCLE datasets [https://portals.broadinstitute.org/ccle/data]. Please let me know if there is another place to download it. Regards, Vijay

Refer to step:

Load CCLE Mutation Data

ccle_mut_file_name = os.path.join('..', '..', 'onco-gps-paper-analysis', 'data', 'mutation__gene_x_ccle_cellline.gct')
ccle_all_mut_df = pd.read_table(ccle_mut_file_name, skiprows=2, index_col=0)
ccle_all_mut_df.shape
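
The skiprows=2 in the call above skips the two preamble lines of a GCT file: a version tag, then the matrix dimensions. Here is a minimal sketch, assuming a standard GCT v1.2 layout, that checks the file exists and reports those dimensions before loading it. describe_gct is a hypothetical helper and is not part of the notebook.

```python
import os


def describe_gct(path):
    """Return (version, n_rows, n_cols) from the two preamble lines of a GCT file."""
    if not os.path.exists(path):
        # Fail with the exact path so a missing download is easy to spot.
        raise FileNotFoundError(f"GCT file not found: {path}")
    with open(path) as handle:
        version = handle.readline().strip()             # e.g. "#1.2"
        n_rows, n_cols = handle.readline().split()[:2]  # data rows, sample columns
    return version, int(n_rows), int(n_cols)
```

Calling describe_gct(ccle_mut_file_name) before pd.read_table would raise a clear error when the file has not been downloaded, and would otherwise give the expected matrix size to compare against ccle_all_mut_df.shape.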

nvk747 • Apr 17 '19 15:04

Hi @nvk747, sorry for the extremely late reply (nearly 1 year!). I am just seeing this now. Were you able to find the data? Are you still interested?

gwaybio • Apr 08 '20 15:04

Hi @gwaygenomics, thanks for responding, even after a long time. I have not found that file; however, I used an alternative file in its place (CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct from the CCLE website). I am not sure that this file is appropriate for this purpose. In any case, if you have mutation__gene_x_ccle_cellline.gct, please let me know.

Regards, Vijay

nvk747 • Apr 16 '20 14:04

Hi Vijay,

I realized the answer to your question can be found here: UCSD-CCAL/onco_gps_paper_analysis#7

From @KwatME:

mutation__gene_x_ccle_cellline.csv is inside of the zip file spro download gets. Running the same notebook 1 Set up data.ipynb should unzip it. I've attached the line that does the unzip below. Alternatively, if you want to download the zip file yourself and unzip it, here is the link.
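
For anyone unzipping the archive by hand, here is a minimal sketch using Python's standard zipfile module. It is not the line from the notebook mentioned above. extract_member is a hypothetical helper, and the archive path and output directory are whatever you chose when downloading.

```python
import os
import zipfile


def extract_member(zip_path, member_basename, out_dir):
    """Extract the first archive entry whose basename matches; return its path."""
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Entries may sit in subfolders, so compare on the file name only.
            if os.path.basename(name) == member_basename:
                return archive.extract(name, path=out_dir)
    raise FileNotFoundError(f"{member_basename} not found in {zip_path}")
```

The returned path can then be passed to the loading step in place of the missing file name.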

Hope this helps!

gwaybio • Apr 26 '20 10:04

Thanks for the ping, @gwaygenomics.

@nvk747, using CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct is appropriate: this file is used not for the main modeling but for annotating the patterns found in the analysis.

Let me know if you have any other questions.

KwatMDPhD • Apr 29 '20 16:04