sort parent taxons in order, from highest to lowest

Open egonw opened this issue 8 years ago • 4 comments

... but need to figure out first how to get this data from Wikidata...

egonw avatar Jan 09 '18 16:01 egonw

For Homo sapiens (Q15978631) it gives 30 ranks in total.

Ideally these should be displayed in rank order, either highest to lowest or lowest to highest. The order should thus be (lowest to highest):

Homo	genus
Hominina	subtribe
Hominin	tribe
Homininae	subfamily
Hominidae	family
Hominoidea	superfamily
Catarrhini	parvorder
Simiiformes	infraorder
Haplorrhini	suborder
Primates	order
Primatomorpha	mirorder
Euarchontoglires	superorder, grandorder
Holotheria	infraclass
Placentalia	infraclass, cohorte
Eutheria	subclass
Theria	subclass, supercohort
Boreosphenida	infraclass
Cladotheria	legion
Trechnotheria	superlegion
Theriiformes	subclass
mammal	class
Tetrapoda	superclass
Gnathostomata	infraphylum
Vertebrata	subphylum
Chordata	phylum
deuterostome	infrakingdom
Bilateria	subkingdom
animal	kingdom
Eukaryote	superkingdom
biota	superdomain

Clearly there are multiple taxa of equivalent rank, e.g. infraclass. One needs to examine the phylogeny to determine the exact order between them. But even a rough order (domain, kingdom, phylum, class, order) would be nice, even if the sort among names of the same rank is not perfectly correct.

Source for infraclass and legion determination: https://en.wikipedia.org/wiki/Tribosphenida#phylogeny

rossmounce avatar Jan 09 '18 16:01 rossmounce

Yes, this is a mess. I suppose one way to handle it would be to make an explicit translation table in SPARQL between the taxon rank and a numerical value.

fnielsen avatar Jan 09 '18 16:01 fnielsen
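The translation-table idea can be sketched outside SPARQL as well. The following is a minimal Python sketch, not Scholia code: `RANK_ORDER`, its numeric values, and `sort_taxa` are hypothetical names chosen for illustration, and only the main ranks are listed. Ranks missing from the table (e.g. legion) are simply appended at the end in their input order.

```python
# Hypothetical rank-to-number table; the values only encode relative order,
# with gaps left so intermediate ranks could be slotted in later.
RANK_ORDER = {
    "domain": 10,
    "kingdom": 20,
    "phylum": 30,
    "class": 40,
    "order": 50,
    "family": 60,
    "genus": 70,
    "species": 80,
}


def sort_taxa(taxa, highest_first=True):
    """Sort (name, rank) pairs by rank using RANK_ORDER.

    Taxa whose rank is not in the table are kept in input order
    and placed after all taxa with a known rank.
    """
    known = [t for t in taxa if t[1] in RANK_ORDER]
    unknown = [t for t in taxa if t[1] not in RANK_ORDER]
    known.sort(key=lambda t: RANK_ORDER[t[1]], reverse=not highest_first)
    return known + unknown


# A few (name, rank) pairs taken from the list earlier in this thread.
sample = [
    ("Homo", "genus"),
    ("Cladotheria", "legion"),
    ("Hominidae", "family"),
    ("Primates", "order"),
    ("mammal", "class"),
    ("Chordata", "phylum"),
    ("animal", "kingdom"),
]

for name, rank in sort_taxa(sample):
    print(f"{name}\t{rank}")
```

A table like this gives the rough kingdom-to-genus order, but it cannot order several taxa that share one rank (such as the infraclasses above); that still needs the phylogeny.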

Today on the Wikidata Telegram a similar question came up and Andrew posted a query he had worked on for MPs. This led me to this query we can use to solve this issue:

# chain of parent taxa (P171) for a taxon, with the distance to each ancestor
# (adapted from a query about MPs; the MP-specific comments and unused VALUES are removed)

SELECT DISTINCT ?taxon ?taxonLabel ?relative ?relativeLabel ?distance
WITH { 
  SELECT DISTINCT ?taxon ?taxonLabel ?relative ?relativeLabel (count(distinct ?rel) as ?distance) # find taxon, ancestor, count generations
  WHERE  { 
  values ?taxon { wd:Q133128 }
  
  ?taxon wdt:P171* ?rel . ?rel wdt:P171+ ?relative.
  ?taxon wdt:P31 wd:Q16521 .     # instance of: taxon
  ?rel wdt:P31 wd:Q16521 .       # instance of: taxon
  ?relative wdt:P31 wd:Q16521 .  # instance of: taxon
  } GROUP BY ?taxon ?taxonLabel ?relative ?relativeLabel 
} AS %ANCESTORS WHERE {
  INCLUDE %ANCESTORS
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
} order by desc(?distance)

cc @Adafede

egonw avatar Jul 29 '23 20:07 egonw

@egonw I prefer wdt:P225 (taxon name) to labels, see:

SELECT DISTINCT ?taxon ?taxon_name ?relative ?relative_name (COUNT(DISTINCT ?rel) AS ?distance) WHERE {
  VALUES ?taxon {
    wd:Q133128
  }
  ?taxon (wdt:P171*) ?rel;
    wdt:P225 ?taxon_name.
  ?rel (wdt:P171+) ?relative.
  ?relative wdt:P225 ?relative_name.
}
GROUP BY ?taxon ?taxon_name ?relative ?relative_name
ORDER BY DESC (?distance)

Adafede avatar Jul 29 '23 20:07 Adafede
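Once a query like the ones above returns a `?distance` per ancestor, the display order can also be chosen on the client side. Below is a minimal Python sketch, assuming the response uses the standard SPARQL JSON results format and the variable names `relative_name` and `distance` from the last query. The function name `ancestors_by_distance` is hypothetical, and the sample bindings are illustrative values, not real query output.

```python
def ancestors_by_distance(sparql_json, nearest_first=True):
    """Return (name, distance) pairs from SPARQL JSON results, sorted by distance.

    Assumes each binding has 'relative_name' and 'distance' variables,
    as in the last query of this thread.
    """
    rows = [
        (b["relative_name"]["value"], int(b["distance"]["value"]))
        for b in sparql_json["results"]["bindings"]
    ]
    rows.sort(key=lambda row: row[1], reverse=not nearest_first)
    return rows


# Illustrative response fragment (made-up distances, not real query output).
sample_response = {
    "head": {"vars": ["relative_name", "distance"]},
    "results": {
        "bindings": [
            {"relative_name": {"type": "literal", "value": "Hominidae"},
             "distance": {"type": "literal", "value": "3"}},
            {"relative_name": {"type": "literal", "value": "Homo"},
             "distance": {"type": "literal", "value": "1"}},
            {"relative_name": {"type": "literal", "value": "Homininae"},
             "distance": {"type": "literal", "value": "2"}},
        ]
    },
}

# Lowest rank (closest ancestor) first; pass nearest_first=False for highest first.
for name, distance in ancestors_by_distance(sample_response):
    print(distance, name)
```

Sorting by distance sidesteps the rank-name problem entirely, since the order follows the parent taxon chain rather than the rank labels.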