kg-microbe
ingest gene knockout data from LBL microbial fitness experiments
All of the data is here (84G total): http://genomics.lbl.gov/supplemental/bigfit/
The numerical relative growth data would have to be converted to a binary growth vs. no-growth call, e.g. via thresholding.
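A minimal sketch of the thresholding step, assuming the relative growth values are log ratios where strongly negative means the knockout grows poorly. The function name `call_growth` and the cutoff of -1.0 are illustrative assumptions, not values taken from the dataset; the real cutoff would need to be chosen against the data.

```python
def call_growth(log_ratio: float, threshold: float = -1.0) -> str:
    """Return 'no_growth' when the log ratio is at or below the cutoff.

    The default threshold of -1.0 is a placeholder assumption.
    """
    return "no_growth" if log_ratio <= threshold else "growth"


# Hypothetical gene ids and values, for illustration only.
example_values = {"geneA": -2.3, "geneB": 0.1}
calls = {gene: call_growth(value) for gene, value in example_values.items()}
```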
Just taking the first organism as an example: http://genomics.lbl.gov/supplemental/bigfit/html/acidovorax_3H11/
On the organism page, under 'Genes', the 'Specific phenotypes' link gives a table of the most significant phenotype per gene for this KO dataset: http://genomics.lbl.gov/supplemental/bigfit/html/acidovorax_3H11/specific_phenotypes This file can serve as the primary data source. These columns:
- sysName: gene name
- desc: description
- name: internal name
- lrn: log ratio normalized
- t: t-statistic
- Group: condition group
- Condition_1: condition name
- Concentration_1: concentration
- Units_1: unit
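The column mapping above could be applied at read time roughly as follows. This is a sketch that assumes the specific phenotypes table is tab-delimited with a header row, which has not been verified here; the sample row, the output key names, and the function `read_specific_phenotypes` are all hypothetical.

```python
import csv
import io

# Hypothetical one-row excerpt; real rows would come from the downloaded table.
SAMPLE = (
    "sysName\tdesc\tname\tlrn\tt\tGroup\tCondition_1\tConcentration_1\tUnits_1\n"
    "GENE_0001\thypothetical protein\tgene1\t-2.5\t-8.1\tcarbon source\tD-Glucose\t20\tmM\n"
)

# Source column -> descriptive key (the key names are an assumption).
COLUMN_MAP = {
    "sysName": "gene_name",
    "desc": "description",
    "name": "internal_name",
    "lrn": "log_ratio_normalized",
    "t": "t_statistic",
    "Group": "condition_group",
    "Condition_1": "condition_name",
    "Concentration_1": "concentration",
    "Units_1": "unit",
}


def read_specific_phenotypes(handle):
    """Parse a tab-delimited table into dicts keyed by the descriptive names."""
    records = []
    for row in csv.DictReader(handle, delimiter="\t"):
        record = {COLUMN_MAP[k]: v for k, v in row.items() if k in COLUMN_MAP}
        # Only the two statistics are converted; other fields stay as text.
        record["log_ratio_normalized"] = float(record["log_ratio_normalized"])
        record["t_statistic"] = float(record["t_statistic"])
        records.append(record)
    return records


records = read_specific_phenotypes(io.StringIO(SAMPLE))
```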
For reference, under 'Genes' the 'Gene fitness' link gives a full table of relative fitness values: http://genomics.lbl.gov/supplemental/bigfit/html/acidovorax_3H11/fit_logratios_good.tab The rows are labeled by 'locusId' (gene ids) and the columns by condition (sample) ids, each including a text description.
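For ingest, a gene-by-condition matrix like this is usually easier to handle as one row per (gene, condition) pair. The sketch below assumes a tab-delimited file with a 'locusId' column followed by one column per condition; the sample header text is invented, and any additional annotation columns the real file may carry would need to be listed in the hypothetical `skip_columns` parameter.

```python
import csv
import io

# Hypothetical excerpt: one id column, then one column per condition.
SAMPLE_WIDE = (
    "locusId\tset1 LB\tset2 glucose\n"
    "101\t-0.1\t-3.2\n"
    "102\t0.2\t0.05\n"
)


def melt_fitness(handle, id_column="locusId", skip_columns=()):
    """Yield (gene_id, condition, fitness) tuples from a wide fitness table."""
    for row in csv.DictReader(handle, delimiter="\t"):
        gene_id = row[id_column]
        for column, value in row.items():
            if column == id_column or column in skip_columns:
                continue
            yield gene_id, column, float(value)


long_rows = list(melt_fitness(io.StringIO(SAMPLE_WIDE)))
```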
There is additional data on each condition on the organism page under 'Tables' then 'Experiments' then 'Detailed metadata for experiments': http://genomics.lbl.gov/supplemental/bigfit/html/acidovorax_3H11/expsUsed
A basic ingest of this data would model it as mutant alleles, or as a gene-condition relation indicating that gene X is essential for growth in condition Y. As key supporting data, the gene annotations should also be ingested: http://genomics.lbl.gov/supplemental/bigfit/html/acidovorax_3H11/fit_genes.tab with the caveat that these are 'free text' annotations and so may require standardization.
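One possible shape for the gene-condition relation is a simple subject/predicate/object edge, emitted only when a knockout fails to grow. This is a sketch, not the kg-microbe data model: the `Edge` class, the predicate string `essential_for_growth_in`, and the input tuples (with a growth call already made) are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    subject: str    # gene id
    predicate: str  # relation label (placeholder, not an established term)
    object: str     # condition name


def essentiality_edges(observations, predicate="essential_for_growth_in"):
    """Build one edge per (gene, condition) where the knockout did not grow.

    `observations` is an iterable of (gene, condition, grows) tuples.
    """
    return [
        Edge(gene, predicate, condition)
        for gene, condition, grows in observations
        if not grows
    ]


# Hypothetical observations.
edges = essentiality_edges(
    [("geneA", "D-Glucose", False), ("geneB", "D-Glucose", True)]
)
```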
Further ingests could include:
- The expsUsed table could be treated as a Sample metadata table and run through the usual NLP process.
- Significance values for each fitness value, e.g.: http://genomics.lbl.gov/supplemental/bigfit/html/acidovorax_3H11/fit_t.tab
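If the significance values are ingested, they could be used to keep only well-supported fitness values before any growth call is made. The sketch below assumes both tables have been reduced to dicts keyed by (gene, condition); the function name and the |t| cutoff of 4.0 are placeholder assumptions, not values from the dataset.

```python
def significant_fitness(fitness, t_values, min_abs_t=4.0):
    """Keep fitness values whose matching t-statistic passes the cutoff.

    Pairs missing from `t_values` are dropped rather than guessed at.
    """
    return {
        key: value
        for key, value in fitness.items()
        if key in t_values and abs(t_values[key]) >= min_abs_t
    }


# Hypothetical inputs keyed by (gene, condition).
fitness = {("101", "set1"): -3.2, ("102", "set1"): -1.5, ("103", "set1"): 0.4}
t_values = {("101", "set1"): -9.0, ("102", "set1"): -1.2}
kept = significant_fitness(fitness, t_values)
```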