Daniel Himmelstein

Results: 584 comments by Daniel Himmelstein

> a simple initial interface is optimal

I went with a simple solution. In dhimmel/cancer-data@ffe66ab26000379adcd7138b8ff39920d4692ef1, I retained only red and blue mutations (according to Xena), meaning orange and green mutations...

Output looks like this:

| acronym | entrez_gene_id | patients | tumor_mean | normal_mean | mean_diff | t_stat | mlog10_p_value | symbol |
| --- | --- | --- |...

@gwaygenomics what do you think of the plot in [`5.differential-expression.ipynb`](https://github.com/dhimmel/cancer-data/blob/ccbba2229de385f6cbcdc6aacbceb9ca4f93b6ef/5.differential-expression.ipynb)? In other words, do you see biology within? ![cancer-by-nmf-component](https://cloud.githubusercontent.com/assets/1117703/19245746/cc84b03c-8ef0-11e6-84ed-106ad61c4953.png) The heatmap shows differential expression signatures for each cancer. Genes were...

@gwaygenomics we're using a paired t-test, about which [the following has been said](http://support.minitab.com/en-us/minitab/17/topic-library/basic-statistics-and-graphs/hypothesis-tests/tests-of-means/why-use-paired-t/):

> The paired t-test calculates the difference within each before-and-after pair of measurements, determines the mean of...
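As a rough illustration of the paired design mentioned above, here is a minimal sketch that computes the mean within-pair difference and the paired t statistic for one gene using only the Python standard library. The function name `paired_t_statistic` and the expression values are hypothetical, not taken from the notebook; a p-value would additionally need the t distribution (for example via `scipy.stats.ttest_rel`).

```python
import math
import statistics


def paired_t_statistic(tumor, normal):
    """Return (mean_diff, t_stat) for paired tumor/normal measurements.

    Hypothetical helper for illustration only. Each position in `tumor`
    must correspond to the same patient at that position in `normal`.
    """
    if len(tumor) != len(normal):
        raise ValueError("tumor and normal must have the same length")
    if len(tumor) < 2:
        raise ValueError("at least two pairs are required")

    # Difference within each tumor/normal pair
    diffs = [t - n for t, n in zip(tumor, normal)]
    mean_diff = statistics.mean(diffs)

    # Sample standard deviation of the differences (n - 1 denominator)
    sd_diff = statistics.stdev(diffs)
    if sd_diff == 0:
        raise ValueError("differences have zero variance; t is undefined")

    # Standard error of the mean difference, then the t statistic
    t_stat = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return mean_diff, t_stat


# Made-up expression values for four patients (illustrative only)
tumor = [5.0, 6.0, 7.0, 9.0]
normal = [4.0, 4.0, 6.0, 5.0]
mean_diff, t_stat = paired_t_statistic(tumor, normal)
print(f"mean_diff={mean_diff:.3f}, t_stat={t_stat:.3f}")
```

Because the test works on within-patient differences, variation between patients that affects tumor and normal tissue equally cancels out before the statistic is computed.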

@binaypanda, just want to make sure you know we're not the creators of the TCGA datasets. We use Xena Browser data. So I will try to comment on your questions,...

> Do you anticipate complete randomness in the subselection (i.e. totally user selected)

Yes, we should be prepared to serve any combination of rows.

> Perhaps a cached or database...

Note that the `gene_mutation` column should be Entrez GeneIDs.

This pull request creates a mapping from gene (as Entrez GeneIDs) to the list of mutated samples (as TCGA sample IDs). This dictionary/JSON object is called `gene_to_mutated_samples`. As a JSON...
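A minimal sketch of how such a mapping could be built and serialized, assuming the input is a sequence of `(sample_id, entrez_gene_id)` pairs. The helper name `build_gene_to_mutated_samples` and the sample IDs are hypothetical, and the pull request's actual implementation may differ. JSON object keys must be strings, so the GeneIDs are converted on output.

```python
import json
from collections import defaultdict


def build_gene_to_mutated_samples(records):
    """Map each Entrez GeneID to a sorted list of mutated sample IDs.

    Hypothetical helper for illustration. `records` is an iterable of
    (sample_id, entrez_gene_id) pairs; duplicate pairs are collapsed.
    """
    gene_to_samples = defaultdict(set)
    for sample_id, entrez_gene_id in records:
        gene_to_samples[entrez_gene_id].add(sample_id)
    # JSON keys must be strings; sort for stable, diff-friendly output
    return {
        str(gene): sorted(samples)
        for gene, samples in sorted(gene_to_samples.items())
    }


# Made-up sample IDs; 7157 and 3845 are the Entrez GeneIDs for TP53 and KRAS
records = [
    ("TCGA-XX-0001-01", 7157),
    ("TCGA-XX-0002-01", 7157),
    ("TCGA-XX-0001-01", 3845),
    ("TCGA-XX-0001-01", 7157),  # duplicate pair, collapsed by the set
]
gene_to_mutated_samples = build_gene_to_mutated_samples(records)
print(json.dumps(gene_to_mutated_samples, indent=2))
```

Collecting samples in a set before sorting keeps each sample listed once per gene even when a sample carries several mutations in the same gene.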

@patrick-miller great issue. It's something we should consider to avoid [false research findings](https://doi.org/10.1371/journal.pmed.0020124 "Why Most Published Research Findings Are False"). You touch on several partial solutions, which I think all...

> It is still not clear why few oncogenes produced such bad results.

I'm happy to see mediocre results for modeling some mutations. With gene expression, universally positive results are...