TCGAbiolinks
Downloading Mutation data (hg19) for a cancer type
Hi
I am trying to download mutation data (hg19) for ESCA in MAF format. The first query below runs, but when I add a file.type filter I get an error.
Can you help me, please?
> query.maf.hg19 <- GDCquery(project = "TCGA-ESCA",
+ data.category = "Simple nucleotide variation",
+ data.type = "Simple somatic mutation",
+ access = "open",
+ legacy = TRUE)
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg19
--------------------------------------------
oo Accessing GDC. This might take a while...
--------------------------------------------
ooo Project: TCGA-ESCA
--------------------
oo Filtering results
--------------------
ooo By access
ooo By data.type
----------------
oo Checking data
----------------
ooo Check if there are duplicated cases
ooo Check if there results for the query
-------------------
o Preparing output
-------------------
> View(query.maf.hg19[[1]][[1]])
> query.maf.hg19 <- GDCquery(project = "TCGA-ESCA",
+ data.category = "Simple nucleotide variation",
+ data.type = "Simple somatic mutation",
+ access = "open",
+ file.type = "bcgsc.ca_ESCA.IlluminaHiSeq_DNASeq.1.somatic.maf ",
+ legacy = TRUE)
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg19
--------------------------------------------
oo Accessing GDC. This might take a while...
--------------------------------------------
ooo Project: TCGA-ESCA
--------------------
oo Filtering results
--------------------
ooo By access
ooo By data.type
ooo By file.type
|Files |
|:----------------------------------------------------------------------|
|bcgsc.ca_ESCA.IlluminaHiSeq_DNASeq.1.somatic.maf |
|genome.wustl.edu_ESCA.IlluminaHiSeq_DNASeq_automated.1.1.0.somatic.maf |
|gsc_ESCA_pairs.aggregated.capture.tcga.uuid.automated.somatic.maf |
|hgsc.bcm.edu_ESCA.IlluminaGA_DNASeq.1.somatic.maf |
|ucsc.edu_ESCA.IlluminaGA_DNASeq_automated.Level_2.1.0.0.somatic.maf |
|NA |
|NA |
|NA |
|NA |
|NA |
Error in GDCquery(project = "TCGA-ESCA", data.category = "Simple nucleotide variation", :
We were not able to filter using this file type. Examples of available files are above. Please check the vignette for possible entries
I'll check your code soon.
In the meantime, you may want to check https://gdc.cancer.gov/about-data/publications/mc3-2017 for hg19 mutations.
Hi,
> query.maf.hg19 <- GDCquery(project = "TCGA-ESCA",
+ data.category = "Simple nucleotide variation",
+ data.type = "Simple somatic mutation",
+ access = "open",
+ file.type = "bcgsc.ca_ESCA.IlluminaHiSeq_DNASeq.1.somatic.maf ",
+ legacy = TRUE)
Your file.type value has empty (whitespace) characters at the end, just before the closing quote. If you remove them, it should work:
query.maf.hg19 <- GDCquery(project = "TCGA-ESCA",
data.category = "Simple nucleotide variation",
data.type = "Simple somatic mutation",
access = "open",
file.type = "bcgsc.ca_ESCA.IlluminaHiSeq_DNASeq.1.somatic.maf",
legacy = TRUE)
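To make the difference visible, here is a minimal base R sketch that compares the string from the failing call with the file name printed in the error output. The variable names are only illustrative, and it is an assumption (suggested by the error message, not confirmed from the package source) that GDCquery needs file.type to match one of the listed names exactly.

```r
# Value used in the failing call: note the space before the closing quote.
file_type_raw <- "bcgsc.ca_ESCA.IlluminaHiSeq_DNASeq.1.somatic.maf "

# File name as listed by GDCquery in the error output.
available <- "bcgsc.ca_ESCA.IlluminaHiSeq_DNASeq.1.somatic.maf"

# The two strings differ by exactly one trailing character.
nchar(file_type_raw) - nchar(available)

# An exact comparison therefore fails.
identical(file_type_raw, available)

# trimws() (base R) strips leading and trailing whitespace.
file_type_clean <- trimws(file_type_raw)
identical(file_type_clean, available)
```

Passing the trimmed string (or simply retyping the name without the trailing space) as file.type should then select that MAF file.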
Sorry, what do you mean by file.type having empty characters at the end?
Thank you
