
Add sets of pre-determined gene groups

Open rmarkello opened this issue 5 years ago • 1 comment

The issue

Burt et al., 2018 performed their analyses using pre-determined groups of genes (e.g., "brain-specific," "neuron- and oligodendrocyte-specific"). It would be nice to incorporate these "standard" sets into abagen so that users don't have to dig through the supplementary materials to get them. (Also, the Excel file that contains the gene groups for that article has some gene symbols auto-converted to dates, so converting those back for users would be helpful!)
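As a rough illustration of the date clean-up mentioned above, here is a minimal sketch. It assumes the mangled entries appear as text like `1-Sep` or `2-Mar`; the function name `restore_gene_symbol` and the month-to-prefix mapping are hypothetical and not part of abagen. The `Mar` case is ambiguous (both `MARCH#` and `MARC#` symbols can end up as `#-Mar`), so a real fix would need to be checked against the original gene list.

```python
import re

# Assumed mapping from Excel month abbreviations back to gene-symbol prefixes.
# "mar" is ambiguous (MARCH# and MARC# both become "#-Mar"); MARCH is a guess.
_MONTH_TO_PREFIX = {"sep": "SEPT", "mar": "MARCH", "dec": "DEC"}

# Matches date-like strings such as "1-Sep" or "10-Mar".
_DATE_LIKE = re.compile(r"^(\d{1,2})-([A-Za-z]{3})$")


def restore_gene_symbol(value):
    """Return a best-guess gene symbol for a date-mangled entry.

    Entries that do not look like an Excel date, or whose month has no
    assumed prefix, are returned unchanged.
    """
    match = _DATE_LIKE.match(value.strip())
    if match is None:
        return value
    number, month = match.groups()
    prefix = _MONTH_TO_PREFIX.get(month.lower())
    if prefix is None:
        return value
    return f"{prefix}{int(number)}"


cleaned = [restore_gene_symbol(s) for s in ["1-Sep", "2-Mar", "GAPDH"]]
```

Leaving unrecognized entries untouched keeps the clean-up conservative: anything the mapping does not cover stays visible for manual review.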

Another potential source of gene groups would be the Molecular Signatures Database (MSigDB), which includes groupings for different biological processes.

Proposed solution

Transcribe the gene symbols for each grouping into a separate CSV file and include these files with the abagen distribution in abagen/data/gene_sets/.

Also add a new function (abagen.get_gene_group(group)) to the codebase, where group can be any of the available groups (e.g., 'brain', 'neuron', 'oligodendrocyte', 'synaptome', 'layers'). This function would simply load the queried set from the distribution and return a list of gene symbols!
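A minimal sketch of what such a loader could look like, not the actual abagen implementation. It assumes one CSV file per group named `<group>.csv`, with one gene symbol per row and no header. The explicit `data_dir` argument is hypothetical (the proposed function takes only `group`) and is there so the sketch runs on its own; the demo file and its `GENE1`/`GENE2` symbols are placeholders.

```python
import csv
import tempfile
from pathlib import Path


def get_gene_group(group, data_dir):
    """Load the gene symbols for `group` from `<data_dir>/<group>.csv`.

    Assumes one symbol per row and no header row (an assumption of this
    sketch, not a documented abagen file format).
    """
    data_dir = Path(data_dir)
    # Available groups are inferred from the CSV files present on disk.
    available = sorted(path.stem for path in data_dir.glob("*.csv"))
    if group not in available:
        raise ValueError(
            f"Unknown gene group {group!r}; available groups: {available}"
        )
    with open(data_dir / f"{group}.csv", newline="") as src:
        # Skip blank rows and strip stray whitespace around symbols.
        return [
            row[0].strip()
            for row in csv.reader(src)
            if row and row[0].strip()
        ]


# Usage with a throwaway directory holding placeholder symbols.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "brain.csv").write_text("GENE1\nGENE2\n")
    demo_genes = get_gene_group("brain", tmp)
```

Inferring the valid group names from the files on disk means that adding a new gene set only requires dropping in another CSV, without touching the function.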

rmarkello, Jun 20 '19 20:06

The Burt et al., 2018 gene groups have been added, but including the Molecular Signatures Database (MSigDB) is still open!

rmarkello, Aug 26 '19 14:08