Query for taxa used in OBO

Open jamesaoverton opened this issue 5 years ago • 7 comments

It would be good to know what taxa are used in OBO, and have a script to build a little tree of them.

I used the Ontobee SPARQL endpoint http://sparql.hegroup.org/sparql/ to run variations on this query:

prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix owl: <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?graph_uri ?taxon ?label
WHERE {
  GRAPH ?graph_uri {
    ?s owl:onProperty <http://purl.obolibrary.org/obo/RO_0002162> ; # in taxon
       owl:someValuesFrom ?taxon . # some X
    ?taxon rdfs:label ?label .
  }
}

There aren't many "'in taxon' some X" results; there are more for "'only in taxon' some X". The vast majority of these come from PR, and most of those are for viruses and bacteria. There are about 700 distinct taxa when PR is included, and about 50 when it is excluded. That list of 50 is interesting.
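The "variations on this query" differ mainly in the property IRI, so they can be generated from one template. The following is a minimal sketch, not the script actually used for the spreadsheet: `build_taxon_query`, `TAXON_PROPERTIES`, and `QUERY_TEMPLATE` are hypothetical names, and it only builds the query text rather than contacting the endpoint.

```python
# Sketch: generate the 'in taxon' / 'only in taxon' variants of the query.
# All Python names here are hypothetical and introduced for illustration.

OBO = "http://purl.obolibrary.org/obo/"

# Relation Ontology properties to search for in existential restrictions.
TAXON_PROPERTIES = {
    "in taxon": OBO + "RO_0002162",
    "only in taxon": OBO + "RO_0002160",
}

# Doubled braces are literal braces once str.format has run.
QUERY_TEMPLATE = """\
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?graph_uri ?taxon ?label
WHERE {{
  GRAPH ?graph_uri {{
    ?s owl:onProperty <{property_iri}> ;
       owl:someValuesFrom ?taxon .
    ?taxon rdfs:label ?label .
  }}
}}
"""


def build_taxon_query(property_label: str) -> str:
    """Return the SPARQL text for one taxon property, looked up by label."""
    return QUERY_TEMPLATE.format(property_iri=TAXON_PROPERTIES[property_label])


if __name__ == "__main__":
    for label in TAXON_PROPERTIES:
        print(f"# {label}")
        print(build_taxon_query(label))
```

Each generated query can then be pasted into the endpoint's web form or sent with any SPARQL client.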

Rough analysis here:

https://docs.google.com/spreadsheets/d/16D7l0G-DL1Liv7yYFYVBEgRNpuNepCQCZoQTLYXXorA

jamesaoverton avatar Mar 19 '20 13:03 jamesaoverton

Why not include only in taxon usages in the list? There are taxa used in GO that aren't in your current list.

balhoff avatar Mar 19 '20 13:03 balhoff

Jim: They are on a different tab

bpeters42 avatar Mar 19 '20 13:03 bpeters42

I did check for 'only in taxon' using a variation on that query, with results in the Google Sheet, but you're right that I somehow missed GO. I'll make sure I get GO when I run this again.

This was quick-and-dirty, but I was asked to follow up later. So I made this issue mostly to remind myself.

jamesaoverton avatar Mar 19 '20 13:03 jamesaoverton

James: The 'unique' tab still has the issue that you are reporting taxa multiple times if they are used with different labels. For example: [image: image.png]
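One way to avoid counting a taxon once per label is to key the results on the taxon IRI and collect the labels as a set. This is a rough sketch under assumptions: `group_labels_by_taxon` is a hypothetical helper, and the sample rows are made up for illustration rather than taken from the spreadsheet.

```python
from collections import defaultdict


def group_labels_by_taxon(rows):
    """Collapse (taxon_iri, label) rows into one entry per taxon IRI.

    Returns a dict mapping each IRI to a sorted list of its distinct labels.
    """
    labels = defaultdict(set)
    for taxon_iri, label in rows:
        labels[taxon_iri].add(label)
    return {iri: sorted(found) for iri, found in labels.items()}


if __name__ == "__main__":
    # Illustrative rows only: the same taxon reported under two labels.
    sample_rows = [
        ("http://purl.obolibrary.org/obo/NCBITaxon_9606", "Homo sapiens"),
        ("http://purl.obolibrary.org/obo/NCBITaxon_9606", "human"),
        ("http://purl.obolibrary.org/obo/NCBITaxon_10090", "Mus musculus"),
    ]
    grouped = group_labels_by_taxon(sample_rows)
    print(len(grouped), "distinct taxa")
    for iri, found in grouped.items():
        print(iri, found)
```

Counting the keys of the result gives the number of distinct taxa regardless of how many labels each one carries.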

bpeters42 avatar Mar 19 '20 14:03 bpeters42

You're right @bpeters42, I haven't updated the sheet since we discussed it. I'm just putting the task here, where I'll remember it.

jamesaoverton avatar Mar 19 '20 14:03 jamesaoverton

Jim: They are on a different tab

Thank you @bpeters42, I totally missed that.

balhoff avatar Mar 19 '20 14:03 balhoff

I have a prototype here:

  • tree view: https://xcl.ontodev.com/branches/organism-specific/views/build/obo-taxonomy-tree.html?text=Vertebrata%20%3Cvertebrates%3E
  • code: https://github.com/jamesaoverton/experimental-cell-ontology/blob/organism-specific/Makefile#L200
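The general idea of a "little tree" of used taxa can be sketched as follows. This is not the code from the linked Makefile, which is not reproduced here; it is an illustration that assumes a child-to-parent map is already available. `build_tree`, `render`, and the toy `PARENT_OF` hierarchy (heavily simplified, skipping intermediate ranks) are hypothetical.

```python
def build_tree(used, parent_of):
    """Return {parent: sorted children} covering used taxa and their ancestors.

    Taxa that are neither used nor ancestors of a used taxon are pruned.
    """
    children = {}
    seen = set()
    for taxon in used:
        node = taxon
        # Walk up towards the root, stopping at already-visited ancestors.
        while node not in seen:
            seen.add(node)
            parent = parent_of.get(node)
            if parent is None:
                break
            children.setdefault(parent, set()).add(node)
            node = parent
    return {parent: sorted(kids) for parent, kids in children.items()}


def render(tree, root, depth=0):
    """Return the tree below root as a list of indented text lines."""
    lines = ["  " * depth + root]
    for child in tree.get(root, []):
        lines.extend(render(tree, child, depth + 1))
    return lines


# Toy hierarchy for illustration only; real taxonomies have many more ranks.
PARENT_OF = {
    "Homo sapiens": "Mammalia",
    "Mus musculus": "Mammalia",
    "Gallus gallus": "Aves",
    "Mammalia": "Vertebrata",
    "Aves": "Vertebrata",
}

if __name__ == "__main__":
    tree = build_tree({"Homo sapiens", "Mus musculus"}, PARENT_OF)
    print("\n".join(render(tree, "Vertebrata")))
```

Because "Gallus gallus" is not in the used set, the "Aves" branch is dropped from the rendered tree.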

jamesaoverton avatar Apr 23 '20 21:04 jamesaoverton