Daniel Himmelstein
> a simple initial interface is optimal

I went with a simple solution. In dhimmel/cancer-data@ffe66ab26000379adcd7138b8ff39920d4692ef1, I retained only red and blue mutations (according to Xena), meaning orange and green mutations...
Output looks like this:

| acronym | entrez_gene_id | patients | tumor_mean | normal_mean | mean_diff | t_stat | mlog10_p_value | symbol |
| --- | --- | --- |...
@gwaygenomics what do you think of the plot in [`5.differential-expression.ipynb`](https://github.com/dhimmel/cancer-data/blob/ccbba2229de385f6cbcdc6aacbceb9ca4f93b6ef/5.differential-expression.ipynb)? In other words, do you see biology within?

[Image: cancer-by-nmf-component]

The heatmap shows differential expression signatures for each cancer. Genes were...
@gwaygenomics we're using a paired t-test, about which [the following has been said](http://support.minitab.com/en-us/minitab/17/topic-library/basic-statistics-and-graphs/hypothesis-tests/tests-of-means/why-use-paired-t/): > The paired t-test calculates the difference within each before-and-after pair of measurements, determines the mean of...
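To make the quoted description concrete, here is a minimal sketch of the paired t-statistic using only the Python standard library. The function name `paired_t_statistic` and the tumor/normal pairing are assumptions for illustration, not the code from the notebook. It only computes the mean difference and the t-statistic; a p-value would additionally need the t-distribution (for example via `scipy.stats.ttest_rel`).

```python
import math
import statistics


def paired_t_statistic(tumor, normal):
    """Return (mean_diff, t_stat) for paired measurements.

    `tumor[i]` and `normal[i]` are assumed to come from the same patient.
    """
    if len(tumor) != len(normal):
        raise ValueError("paired samples must have the same length")
    # Difference within each pair of measurements.
    diffs = [t - n for t, n in zip(tumor, normal)]
    if len(diffs) < 2:
        raise ValueError("need at least two pairs")
    mean_diff = statistics.mean(diffs)
    # Sample standard deviation of the differences (n - 1 denominator).
    sd_diff = statistics.stdev(diffs)
    if sd_diff == 0:
        raise ValueError("differences have zero variance")
    # Standard error of the mean difference.
    standard_error = sd_diff / math.sqrt(len(diffs))
    return mean_diff, mean_diff / standard_error


# Made-up expression values for three hypothetical patients.
mean_diff, t_stat = paired_t_statistic([3.0, 5.0, 4.0], [1.0, 2.0, 3.0])
print(mean_diff, t_stat)
```

Because each patient serves as their own control, between-patient variation drops out of the differences, which is the usual argument for pairing.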
@binaypanda, just want to make sure you know we're not the creators of the TCGA datasets. We use Xena Browser data. So I will try to comment on your questions,...
> Do you anticipate complete randomness in the subselection (i.e. totally user selected)

Yes we should be prepared to serve any combination of rows.

> Perhaps a cached or database...
Note that the `gene_mutation` column should be Entrez GeneIDs.
This pull request creates a mapping from gene (as Entrez GeneIDs) to the list of mutated samples (as TCGA sample IDs). This dictionary/JSON object is called `gene_to_mutated_samples`. As a JSON...
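As a rough illustration of the shape of such a mapping, the sketch below builds a `gene_to_mutated_samples` dictionary from (gene, sample) pairs. The helper name `build_gene_to_mutated_samples`, the input format, and the sample IDs are all made up for the example and are not taken from the pull request.

```python
import collections
import json


def build_gene_to_mutated_samples(mutation_records):
    """Group sample IDs by Entrez GeneID.

    `mutation_records` is assumed to be an iterable of
    (entrez_gene_id, sample_id) pairs; duplicates are collapsed.
    """
    gene_to_samples = collections.defaultdict(set)
    for entrez_gene_id, sample_id in mutation_records:
        gene_to_samples[int(entrez_gene_id)].add(sample_id)
    # Sort genes and samples so the output is deterministic.
    return {
        gene: sorted(samples)
        for gene, samples in sorted(gene_to_samples.items())
    }


# Placeholder sample IDs; 7157 and 3845 are the Entrez GeneIDs of TP53 and KRAS.
records = [
    (7157, "SAMPLE-B"),
    (7157, "SAMPLE-A"),
    (3845, "SAMPLE-A"),
    (7157, "SAMPLE-A"),  # duplicate record, kept only once
]
gene_to_mutated_samples = build_gene_to_mutated_samples(records)

# JSON object keys are always strings, so integer GeneIDs are serialized as "7157".
print(json.dumps(gene_to_mutated_samples, indent=2))
```

Anyone reading the JSON back needs to convert the keys to integers again if they want numeric GeneIDs.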
@patrick-miller great issue. It's something we should consider to avoid [false research findings](https://doi.org/10.1371/journal.pmed.0020124 "Why Most Published Research Findings Are False"). You touch on several partial solutions, which I think all...
> It is still not clear why few oncogenes produced such bad results.

I'm happy to see mediocre results for modeling some mutations. With gene expression, universally positive results are...