
extract clinical data from previous research

Open bioinfo-dirty-jobs opened this issue 7 years ago • 4 comments

I want to download all the clinical data for the RNA-seq files selected below:


```
expands = c("diagnoses","annotations",
            "demographic","exposures")
clinResults = cases() %>%

  GenomicDataCommons::select(filter( ~ cases.project.project_id == 'TCGA-OV' &
                                       type == 'gene_expression' &
                                       analysis.workflow_type == 'HTSeq - Counts') ) %>%
  GenomicDataCommons::expand(expands) %>%
  results(size=300)
str(clinResults,list.len=10)
write.table(clinResults,"Clinical_results.csv",sep="\t",row.names = FALSE)
```

bioinfo-dirty-jobs, May 16 '18 14:05

You'll need to do this in two steps.

  1. Do your files query and include the files.cases.case_id.
  2. Use the case_ids from query 1 as input to gdc_clinical.

Give it a try and let me know if you need more direction. Great question!
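The two steps above can be sketched roughly as follows. This is a minimal, untested outline, not a definitive implementation: `collect_case_ids` and `clinical_for_ov_counts` are hypothetical helper names, the filter values are copied from the question, and the assumption that the result carries a `cases` element with one entry per file (each holding a `case_id`) should be checked with `str()` on a real response. The workflow label is also taken as given; newer GDC releases may name it differently.

```r
# Hypothetical helper: flatten the per-file `cases` entries returned by a
# files query into a vector of unique case IDs.
collect_case_ids <- function(cases_by_file) {
  ids <- unlist(lapply(cases_by_file, function(x) x[["case_id"]]),
                use.names = FALSE)
  unique(as.character(ids))
}

# Hypothetical wrapper for the two steps. Calling it requires the
# GenomicDataCommons package and network access to the GDC API.
clinical_for_ov_counts <- function() {
  # Step 1: query the files endpoint for the expression files of interest
  # and ask for the case_id attached to each file.
  q <- GenomicDataCommons::files()
  q <- GenomicDataCommons::filter(
    q,
    ~ cases.project.project_id == 'TCGA-OV' &
      type == 'gene_expression' &
      analysis.workflow_type == 'HTSeq - Counts'
  )
  q <- GenomicDataCommons::select(q, 'cases.case_id')
  res <- GenomicDataCommons::results_all(q)

  # Assumption: res$cases holds one entry per file, each with a case_id.
  case_ids <- collect_case_ids(res$cases)

  # Step 2: pass the case IDs to gdc_clinical() to fetch clinical data.
  GenomicDataCommons::gdc_clinical(case_ids)
}
```

Calling `clin <- clinical_for_ov_counts()` and then inspecting `names(clin)` should show which clinical tables came back. Keeping the ID extraction in its own small function makes it easy to adjust if the response turns out to be shaped differently.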

seandavi, May 16 '18 18:05

@seandavi Thanks so much. I tried to figure it out, but I am missing something. Here is what I have so far:

```
q = cases() %>%
    filter(~ project.project_id=='TCGA-OV'  &
             files.analysis.workflow_type == 'HTSeq - FPKM-UQ')
q %>% count()

file_ids = q %>% facet('files.cases.case_id') %>% response_all() %>%
  ids()
```

So I suppose I now have the IDs in file_ids. How can I retrieve all the diagnosis data? And if I have the bcr_patient_uuid, how can I download the expression data? Could you please give me an example? Thanks so much for your help and patience.

bioinfo-dirty-jobs, May 17 '18 10:05

I'll write something up, but it may take me a few days--sorry for the delay. I really appreciate you working through this with us.

seandavi, May 18 '18 17:05

Dear @seandavi, I am still trying to resolve the problem, but I am missing something. On the cases endpoint I found the field with grep('files.cases.case_id',available_fields('cases'),value=TRUE), but it is not among the expand fields.

```
q = cases() %>%
  GenomicDataCommons::filter(~ project.project_id=='TCGA-OV'  &
                               files.analysis.workflow_type == 'HTSeq - FPKM-UQ') %>%
  GenomicDataCommons::expand("diagnoses") %>% facet()
q %>% results()
```

bioinfo-dirty-jobs, May 24 '18 11:05