cancer-data

TCGA data acquisition and processing for Project Cognoma

Results 13 cancer-data issues

New, publicly available dataset of 11,078 RNAseq + clinical childhood cancer tumors. [Xena data](https://xenabrowser.net/datapages/?hub=https://treehouse.xenahubs.net:443) [Blog Post](http://www.rna-seqblog.com/new-tumor-database-of-rna-seq-data-deployed-to-battle-childhood-cancer-at-uc-santa-cruz/?utm_campaign=shareaholic&utm_medium=twitter&utm_source=socialnetwork) This will open up a lot of analysis opportunities - exciting it is now...

Stumbled upon [snaptron](https://twitter.com/GreeneScientist/status/832635971883593728) today and eventually found my way to this [resource](http://snaptron.cs.jhu.edu/data/tcga/). There are many variables curated here measured on each sample (in `samples.tsv`) including treatment (both specific therapeutic agent,...

Which column in the clinical_data should I consider to know if the tumor has recurred or not? Does _RFS_IND=1 mean definitely recurred? How do I know if the tumor is...

Note to untrack `data/complete/differential-expression.tsv.bz2` before merging. This is something that @ksimeono -- a cancer biologist -- was interested in. It's potentially out of scope for Cognoma, but I thought it's...

At the 8/23 meetup, @dhimmel expressed interest in incorporating metabolic pathway information by combining the dataset that we have and the hetnet database that was described at the first meetup....

task

I have had this issue in the past ([see zenodo file](https://zenodo.org/deposit/126065/)) and it looks like the current PANCAN_mutation file from xena has fewer samples and fewer columns than a previous...

help wanted

There are a number of questions around how best to represent various items that are important for sample selection. Can someone help out the [django-cognoma](https://github.com/cognoma/django-cognoma) team to specify the best...

Currently, we're storing our datasets (which are matrices) as compressed TSVs, which are great for long-term interoperable storage. However, we'd love a way to look up specific rows and columns without...

task
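
For context on the storage question above, here is a minimal sketch of what a row-and-column lookup against a bz2-compressed TSV looks like with only the Python standard library. It assumes the first line is a header and the first column holds the row identifiers. The function name `read_submatrix` and the sample and gene names in the usage note are hypothetical, chosen for illustration, and are not taken from the project.

```python
import bz2
import csv
import os
import tempfile


def read_submatrix(path, row_ids, column_ids):
    """Stream a bz2-compressed TSV and keep only the requested cells.

    Assumptions (not from the project): the first line is a header and
    the first column of every record is the row identifier.
    Returns a dict mapping row id -> list of values in column_ids order.
    """
    wanted_rows = set(row_ids)
    result = {}
    with bz2.open(path, mode="rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        # Map each requested column name to its position in the header.
        positions = [header.index(name) for name in column_ids]
        # Every record is decompressed and parsed, even the unwanted ones.
        for record in reader:
            if record[0] in wanted_rows:
                result[record[0]] = [record[i] for i in positions]
    return result
```

Calling `read_submatrix("matrix.tsv.bz2", ["s1", "s3"], ["gene_a"])` on a hypothetical file returns only those cells, as strings. The sketch still decompresses and scans the whole file for every query, so lookup time grows with the size of the matrix rather than with the size of the selection.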

In speaking with a cancer biologist and collaborator about Cognoma, it was discovered that a huge win we could relatively easily deliver is classification performance (or classification scores) across different...

enhancement