CEVOpen
AltLabel with different delimiter
We want to retrieve AltLabels from Wikidata with a delimiter other than the default comma. The default delimiter is a problem, especially for the plantCompound
dictionary, because IUPAC names often contain commas, which makes it hard to separate the synonyms from one another.
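The problem can be illustrated with a minimal Python sketch. The two labels are alternative names for carvacrol (wd:Q225543); the comma-joined string is built here for illustration only, and the exact spacing Wikidata puts around the comma is an assumption.

```python
# Two alternative labels for carvacrol (wd:Q225543); the second contains commas.
alt_labels = ["Cymenol", "p-Mentha-1,3,5-trien-2-ol"]

# Assumption: the default output joins labels with a comma and a space.
joined = ", ".join(alt_labels)

# A naive split on commas cuts the IUPAC-style name into pieces.
naive = [part.strip() for part in joined.split(",")]

# With " | " as the delimiter, every label survives the round trip.
piped = " | ".join(alt_labels)
recovered = piped.split(" | ")

print(naive)
print(recovered)
```

The naive split yields four fragments instead of two labels, while the pipe-delimited version gives back the original list.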
I found a Stack Overflow solution for this: https://stackoverflow.com/questions/46850562/how-to-query-wikidata-for-also-known-as
Here is an example:
SELECT ?compound ?compoundLabel ?compoundDescription
       (GROUP_CONCAT(DISTINCT(?altLabel); separator = " | ") AS ?altLabel_list)
WHERE {
  VALUES ?compound {
    wd:Q225543 wd:Q416114
  }
  OPTIONAL { ?compound skos:altLabel ?altLabel . FILTER (lang(?altLabel) = "en") }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
GROUP BY ?compound ?compoundLabel ?compoundDescription
By concatenating the labels with GROUP_CONCAT, we can choose the delimiter ourselves (here " | "). Here is what the SPARQL endpoint returns as XML:
<?xml version="1.0" encoding="UTF-8"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
<head>
<variable name="compound"/>
<variable name="compoundLabel"/>
<variable name="compoundDescription"/>
<variable name="altLabel_list"/>
</head>
<results>
<result>
<binding name="compound">
<uri>http://www.wikidata.org/entity/Q225543</uri>
</binding>
<binding name="compoundLabel">
<literal xml:lang="en">carvacrol</literal>
</binding>
<binding name="compoundDescription">
<literal xml:lang="en">chemical compound</literal>
</binding>
<binding name="altLabel_list">
<literal>Carvacrol | 1-Hydroxy-2-methyl-5-isopropylbenzene | 1-Methyl-2-hydroxy-4-isopropylbenzene | 2-Hydroxy-4-isopropyl-1-methylbenzene | 2-Hydroxy-p-cymene | 2-Hydroxycymene | 2-Methyl-5-(1-methylethyl)-Phenol | 2-Methyl-5-(1-methylethyl)phenol | 2-Methyl-5-isopropylphenol | 2-p-Cymenol | 3-Isopropyl-6-methyl-Phenol | 3-Isopropyl-6-methylphenol | 5-Isopropyl-2-methyl-Phenol | 5-Isopropyl-2-methylphenol | 5-Isopropyl-o-cresol | 6-Methyl-3-isopropylphenol | Antioxine | BENZENE,2-HYDROXY,4-ISOPROPYL,1-METHYL CARVACROL | Cymenol | Cymophenol | FEMA 2245 | Hydroxy-p-cymene | Isopropyl-O-cresol | Isothymol | Isothymol (=2-Isopropyl-4-methyl phenol) | Karvakrol | Methyl-5-(1-methylethyl)phenol | O-Thymol | Oxycymol | p-Cymen-2-ol | p-Cymene-2-ol | p-Mentha-1,3,5-trien-2-ol</literal>
</binding>
</result>
<result>
<binding name="compound">
<uri>http://www.wikidata.org/entity/Q416114</uri>
</binding>
<binding name="compoundLabel">
<literal xml:lang="en">(+/-)-4-terpineol</literal>
</binding>
<binding name="compoundDescription">
<literal xml:lang="en">chemical compound</literal>
</binding>
<binding name="altLabel_list">
<literal>(+-)-p-Menth-1-en-4-ol | 1-Isopropyl-4-methyl-3-cyclohexen-1-ol | 1-isopropyl-4-methylcyclohex-3-en-1-ol | 1-Menthene-4-ol | 1-Methyl-4-isopropyl-1-cyclohexen-4-ol | 1-p-Menthen-4-ol | 1-para-Menthen-4-ol | 1-Terpinen-4-ol | 4-Carvomenthenol | 4-Methyl-1-(1-methylethyl)-3-cyclohexen-1-ol | 4-Methyl-1-isopropyl-3-cyclohexen-1-ol | 4-Terpineol | alpha -Terpinen-4-ol | alpha-terpinen-4-ol | FEMA 2248 | Origanol | p-Menth-1-en-4-ol | Terpene-4-ol | Terpin-4-en-1-ol | Terpinen-4-ol | Terpinene-4-ol | Terpinenol-4 | Terpineol-4</literal>
</binding>
</result>
</results>
</sparql>
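Once the result is in this form, the synonyms can be recovered by splitting each altLabel_list value on " | ". The following is a minimal sketch using only the Python standard library; the function name parse_alt_labels and the abbreviated SAMPLE document are made up for illustration (the sample keeps only three of the carvacrol labels), and it assumes that no label itself contains the chosen delimiter.

```python
import xml.etree.ElementTree as ET

# Namespace used by the SPARQL XML results format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Abbreviated sample in the same shape as the endpoint output above.
SAMPLE = """
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="compound"/>
    <variable name="altLabel_list"/>
  </head>
  <results>
    <result>
      <binding name="compound">
        <uri>http://www.wikidata.org/entity/Q225543</uri>
      </binding>
      <binding name="altLabel_list">
        <literal>Cymenol | Isothymol | p-Mentha-1,3,5-trien-2-ol</literal>
      </binding>
    </result>
  </results>
</sparql>
"""


def parse_alt_labels(xml_text, delimiter=" | "):
    """Map each compound URI to its list of alternative labels."""
    root = ET.fromstring(xml_text)
    labels_by_compound = {}
    for result in root.findall("sr:results/sr:result", NS):
        uri = result.findtext(
            "sr:binding[@name='compound']/sr:uri", namespaces=NS
        )
        joined = result.findtext(
            "sr:binding[@name='altLabel_list']/sr:literal",
            default="",
            namespaces=NS,
        )
        # A compound without alternative labels yields an empty list.
        labels_by_compound[uri] = joined.split(delimiter) if joined else []
    return labels_by_compound


labels = parse_alt_labels(SAMPLE)
for uri, alt_labels in labels.items():
    print(uri, alt_labels)
```

Because the split is on " | " rather than on a comma, a name such as p-Mentha-1,3,5-trien-2-ol stays in one piece.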