
exploring the data

Open ypar opened this issue 7 years ago • 3 comments

An issue was raised in today's meeting regarding visualizations of the clinical data. Visualizations of the other data are also being considered. More immediately, however, we need visualization schemes for the clinical data to support assessment and covariate selection.

ypar avatar Jul 26 '16 23:07 ypar

@Inquisitive-Geek is interested in resolving this issue

ypar avatar Jul 26 '16 23:07 ypar

Approach: seaborn/matplotlib in jupyter notebook

Potential visualizations:

Clinical:
  • Prevalence of tumor sites amongst samples
  • 'Time to event' distribution
  • Other variables of interest?
Sequencing (HiSeqV2):
  • Examine for batch effects? (potentially linking them to contributing variables in the clinical matrix)
Mutation:
  • Prevalence of mutation types
  • Number of mutations per sample ID
  • Most and least mutated genes
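
A few of the plots listed above could be sketched roughly as follows. This is only a starting point under stated assumptions: the two DataFrames are toy stand-ins rather than the project data, and the column names (`sample_id`, `tumor_site`, `days_to_event`, `gene`, `mutation_type`) are hypothetical placeholders for whatever the clinical and mutation matrices actually use. It relies on pandas' matplotlib-backed plotting; seaborn equivalents such as `countplot` would likely work just as well.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Toy stand-in for the clinical matrix (hypothetical column names and values).
clinical = pd.DataFrame({
    "sample_id": ["S1", "S2", "S3", "S4", "S5"],
    "tumor_site": ["lung", "breast", "lung", "colon", "lung"],
    "days_to_event": [120, 340, 95, 410, 230],
})

# Toy stand-in for the mutation table: one row per observed mutation.
mutations = pd.DataFrame({
    "sample_id": ["S1", "S1", "S2", "S3", "S3", "S3"],
    "gene": ["TP53", "KRAS", "TP53", "TP53", "EGFR", "KRAS"],
    "mutation_type": ["missense", "missense", "nonsense",
                      "missense", "frameshift", "missense"],
})

# Summaries behind the plots.
site_counts = clinical["tumor_site"].value_counts()       # prevalence of tumor sites
mutations_per_sample = mutations.groupby("sample_id").size()  # mutations per sample ID
gene_counts = mutations["gene"].value_counts()            # most to least mutated genes

# One panel per summary.
fig, axes = plt.subplots(1, 3, figsize=(15, 4))
site_counts.plot.bar(ax=axes[0], title="Samples per tumor site")
clinical["days_to_event"].plot.hist(ax=axes[1], bins=5, title="Time to event (days)")
gene_counts.plot.bar(ax=axes[2], title="Mutations per gene")
fig.tight_layout()
```

In a notebook the figure renders at the end of the cell; `mutations_per_sample` and `mutations["mutation_type"].value_counts()` can be plotted the same way.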

Feel free to add your own suggestions below!

clairemcleod avatar Jul 26 '16 23:07 clairemcleod

My only worry with seaborn is that it is very memory-heavy. However, at the scale of this data, I suppose it will be OK.

ypar avatar Jul 28 '16 04:07 ypar