ray
Greater than 1 and negative proportions in gene ontology
Observed
For example, in file
/rap/nne-790-ab/projects/FR-MicrobiomeSinges/20130215_nofail/RayMeta_Sample_JE1/BiologicalAbundances/_GeneOntology/biological_process.Depth=1.tsv
Identifier  Name                              Proportion  Observations  Total
GO:0000003  reproduction                      0.0123192   267993        21754013
GO:0002376  immune system process             0.749373    16301880      21754013
GO:0008152  metabolic process                 -72.3133    -1573105383   21754013
GO:0009987  cellular process                  17.0055     369936933     21754013
GO:0022414  reproductive process              0.985647    21441772      21754013
GO:0022610  biological adhesion               0.468828    10198901      21754013
GO:0023052  signaling                         0.265745    5781018       21754013
GO:0032501  multicellular organismal process  1.76106     38310165      21754013
GO:0032502  developmental process             17.6075     383034541     21754013
See metabolic process (GO:0008152), whose proportion and observation count are both negative.
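To spot such rows quickly, a small script can scan a depth file and list every term whose proportion falls outside [0, 1]. This is only a sketch: it assumes the file is tab-separated with the header shown above, and the function name `find_out_of_range` is made up for illustration; it is not part of Ray.

```python
import csv
import io


def find_out_of_range(tsv_text):
    """Return (identifier, name, proportion) for rows with a proportion outside [0, 1].

    Assumes tab-separated text with the columns
    Identifier, Name, Proportion, Observations, Total.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    flagged = []
    for row in reader:
        proportion = float(row["Proportion"])
        if not 0.0 <= proportion <= 1.0:
            flagged.append((row["Identifier"], row["Name"], proportion))
    return flagged


# Three rows copied from the file quoted in this ticket.
sample = (
    "Identifier\tName\tProportion\tObservations\tTotal\n"
    "GO:0000003\treproduction\t0.0123192\t267993\t21754013\n"
    "GO:0008152\tmetabolic process\t-72.3133\t-1573105383\t21754013\n"
    "GO:0009987\tcellular process\t17.0055\t369936933\t21754013\n"
)

for identifier, name, proportion in find_out_of_range(sample):
    print(identifier, name, proportion)
```

To run it on a real file, read the file contents into a string and pass that string to `find_out_of_range`.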
Expectations
Be able to get the "levelled" GO without weird numbers. In fact, what I need is a file like 0.Profile.GeneOntologyDomain=biological_process.tsv but with level information.
There was a discussion last week on the mailing list about proportions exceeding 100% for the files at specific depth.
http://permalink.gmane.org/gmane.science.biology.ray-genome-assembler/406
This happens because EMBL_CDS can annotate any kmer on several ontology terms that are all on the same path from the root to a particular term in the Gene Ontology directed acyclic graph.
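The effect can be reproduced on a toy example. The sketch below is not Ray's code: the three-term hierarchy, the term names, and the function `recursive_counts` are all hypothetical, and it assumes that each direct annotation is propagated to every ancestor without deduplicating per k-mer. Under that assumption, a k-mer annotated on both a term and its descendant is counted twice for the upper term.

```python
from collections import Counter

# Hypothetical toy hierarchy: term -> parent term (the root has no entry).
PARENT = {"child_term": "depth1_term", "depth1_term": "root"}


def ancestors_and_self(term):
    """Yield the term itself, then each ancestor up to the root."""
    while term is not None:
        yield term
        term = PARENT.get(term)


def recursive_counts(annotations):
    """Count observations per term, propagating every annotation to all ancestors.

    annotations holds one set of directly annotated terms per k-mer.
    Nothing is deduplicated per k-mer, which is the assumption being illustrated.
    """
    counts = Counter()
    for terms in annotations:
        for term in terms:
            for t in ancestors_and_self(term):
                counts[t] += 1
    return counts


# Ten k-mers, each annotated on two terms that lie on the same path.
annotations = [{"depth1_term", "child_term"}] * 10
counts = recursive_counts(annotations)
total = len(annotations)

# depth1_term is counted once directly and once through child_term for every k-mer.
print(counts["depth1_term"], total, counts["depth1_term"] / total)
```

Here `depth1_term` ends up with 20 observations for only 10 k-mers, so its proportion is 2.0.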
For Gene Ontology, only Terms.xml and Terms.tsv are documented in Documentation/.
In the Genome Biology paper, Terms.xml was used.
This ticket (and the issue reported on the mailing list) will likely be resolved by removing these files for particular levels because recursive counts are not useful here.
This means that if I parse Terms.xml I will get the correct information?
Terms.tsv contains the same information that is in Terms.xml.
These, however, are not recursive counts.
On 02/18/2013 10:07 AM, fredericraymond wrote:
> This means that if I parse Terms.xml I will get the correct information?
Evaluation: 5 human-hours
This is presumably a WONTFIX, see above.