
How to get cross-reference descriptions?

Open • CarMoreno opened this issue 10 months ago • 1 comment

Hi there! I'm currently trying to retrieve cross-reference descriptions from the ChEBI Ontology compounds. For example:

<owl:Class rdf:about="http://purl.obolibrary.org/obo/CHEBI_4508">
        <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/CHEBI_26218"/>
        <rdfs:subClassOf>
            <owl:Restriction>
                <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/BFO_0000051"/>
                <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/CHEBI_48311"/>
            </owl:Restriction>
        </rdfs:subClassOf>
        <obo:IAO_0000115 rdf:datatype="http://www.w3.org/2001/XMLSchema#string">The potassium salt of diclofenac.</obo:IAO_0000115>
        <chebi:charge rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">0</chebi:charge>
        <chebi:formula rdf:datatype="http://www.w3.org/2001/XMLSchema#string">C14H10Cl2NO2.K</chebi:formula>
        <chebi:inchi rdf:datatype="http://www.w3.org/2001/XMLSchema#string">InChI=1S/C14H11Cl2NO2.K/c15-10-5-3-6-11(16)14(10)17-12-7-2-1-4-9(12)8-13(18)19;/h1-7,17H,8H2,(H,18,19);/q;+1/p-1</chebi:inchi>
        <chebi:inchikey rdf:datatype="http://www.w3.org/2001/XMLSchema#string">KXZOIWWTXOCYKR-UHFFFAOYSA-M</chebi:inchikey>
        <chebi:mass rdf:datatype="http://www.w3.org/2001/XMLSchema#decimal">334.243</chebi:mass>
        <chebi:monoisotopicmass rdf:datatype="http://www.w3.org/2001/XMLSchema#decimal">332.97257</chebi:monoisotopicmass>
        <chebi:smiles rdf:datatype="http://www.w3.org/2001/XMLSchema#string">O=C([O-])Cc1ccccc1Nc1c(Cl)cccc1Cl.[K+]</chebi:smiles>
        <oboInOwl:hasDbXref rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Beilstein:6625757</oboInOwl:hasDbXref>
        <oboInOwl:hasDbXref rdf:datatype="http://www.w3.org/2001/XMLSchema#string">CAS:15307-81-0</oboInOwl:hasDbXref>
        <oboInOwl:hasDbXref rdf:datatype="http://www.w3.org/2001/XMLSchema#string">DrugBank:DB00586</oboInOwl:hasDbXref>
        <oboInOwl:hasDbXref rdf:datatype="http://www.w3.org/2001/XMLSchema#string">KEGG:D00903</oboInOwl:hasDbXref>
        <oboInOwl:hasDbXref rdf:datatype="http://www.w3.org/2001/XMLSchema#string">PMID:1502708</oboInOwl:hasDbXref>
        <oboInOwl:hasExactSynonym rdf:datatype="http://www.w3.org/2001/XMLSchema#string">potassium {2-[(2,6-dichlorophenyl)amino]phenyl}acetate</oboInOwl:hasExactSynonym>
        <oboInOwl:hasOBONamespace rdf:datatype="http://www.w3.org/2001/XMLSchema#string">chebi_ontology</oboInOwl:hasOBONamespace>
        <oboInOwl:hasRelatedSynonym rdf:datatype="http://www.w3.org/2001/XMLSchema#string">2-((2,6-dichlorophenyl)amino)benzeneacetic acid, monopotassium salt</oboInOwl:hasRelatedSynonym>
        <oboInOwl:hasRelatedSynonym rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Cataflam</oboInOwl:hasRelatedSynonym>
        <oboInOwl:id rdf:datatype="http://www.w3.org/2001/XMLSchema#string">CHEBI:4508</oboInOwl:id>
        <oboInOwl:inSubset rdf:resource="http://purl.obolibrary.org/obo/chebi/3_STAR"/>
        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">diclofenac potassium</rdfs:label>
    </owl:Class>
    <owl:Axiom>
        <owl:annotatedSource rdf:resource="http://purl.obolibrary.org/obo/CHEBI_4508"/>
        <owl:annotatedProperty rdf:resource="http://www.geneontology.org/formats/oboInOwl#hasDbXref"/>
        <owl:annotatedTarget rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Beilstein:6625757</owl:annotatedTarget>
        <oboInOwl:source rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Beilstein</oboInOwl:source>
    </owl:Axiom>
    <owl:Axiom>
        <owl:annotatedSource rdf:resource="http://purl.obolibrary.org/obo/CHEBI_4508"/>
        <owl:annotatedProperty rdf:resource="http://www.geneontology.org/formats/oboInOwl#hasDbXref"/>
        <owl:annotatedTarget rdf:datatype="http://www.w3.org/2001/XMLSchema#string">CAS:15307-81-0</owl:annotatedTarget>
        <oboInOwl:source rdf:datatype="http://www.w3.org/2001/XMLSchema#string">ChemIDplus</oboInOwl:source>
    </owl:Axiom>
    <owl:Axiom>
        <owl:annotatedSource rdf:resource="http://purl.obolibrary.org/obo/CHEBI_4508"/>
        <owl:annotatedProperty rdf:resource="http://www.geneontology.org/formats/oboInOwl#hasDbXref"/>
        <owl:annotatedTarget rdf:datatype="http://www.w3.org/2001/XMLSchema#string">PMID:1502708</owl:annotatedTarget>
        <oboInOwl:source rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Europe PMC</oboInOwl:source>
    </owl:Axiom>

I am expecting something like:

frozenset({
     Xref('ChemIDplus', 'CAS:15307-81-0'), 
     Xref('Europe PMC', 'PMID:1502708'), 
     Xref('Beilstein', 'Beilstein:6625757')
})

However, I am getting:

frozenset({
     Xref('CAS:15307-81-0'), 
     Xref('PMID:1502708'), 
     Xref('Beilstein:6625757')
})

It seems like Pronto might be having some trouble understanding the ontology cross-reference structure. I'm wondering if I might be doing something wrong? Could you please guide me on how to retrieve descriptions? Your assistance would be greatly appreciated! Thank you in advance for your help!

CarMoreno, Mar 28 '24 14:03

Hi @CarMoreno,

The problem here is that what you are trying to retrieve are not descriptions, which would be inside the rdfs:label element of each Axiom, but sources (inside the oboInOwl:source elements). Sources are not strictly supported in OBO files (the format pronto aims to support): they can only be expressed as OBO line qualifiers, which are optional and ignored by some parsers. I don't really have a solution to provide at the moment.

althonos, Apr 03 '24 08:04
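
A possible workaround, outside of pronto, is to read the owl:Axiom blocks directly from the RDF/XML and pair each annotatedTarget with its oboInOwl:source. The sketch below uses only the Python standard library and does not rely on any pronto API; the names `xref_sources` and `SAMPLE` are hypothetical, and the sample document is a trimmed copy of one Axiom from the question, wrapped in an assumed rdf:RDF root with the usual namespace declarations.

```python
import xml.etree.ElementTree as ET

# Standard namespace IRIs used in the ChEBI OWL export.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
OBOINOWL = "http://www.geneontology.org/formats/oboInOwl#"
HAS_DBXREF = OBOINOWL + "hasDbXref"


def xref_sources(root, term_iri):
    """Map each hasDbXref value of `term_iri` to the source on its owl:Axiom.

    Hypothetical helper: xrefs that have no matching owl:Axiom are not returned.
    """
    sources = {}
    for axiom in root.iter(f"{{{OWL}}}Axiom"):
        subject = axiom.find(f"{{{OWL}}}annotatedSource")
        prop = axiom.find(f"{{{OWL}}}annotatedProperty")
        target = axiom.find(f"{{{OWL}}}annotatedTarget")
        source = axiom.find(f"{{{OBOINOWL}}}source")
        # Skip axioms that lack any of the four parts we need.
        if any(elem is None for elem in (subject, prop, target, source)):
            continue
        # Keep only hasDbXref annotations attached to the requested term.
        if subject.get(f"{{{RDF}}}resource") != term_iri:
            continue
        if prop.get(f"{{{RDF}}}resource") != HAS_DBXREF:
            continue
        sources[target.text] = source.text
    return sources


# Trimmed sample: one Axiom from the question inside an assumed rdf:RDF root.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:oboInOwl="http://www.geneontology.org/formats/oboInOwl#">
  <owl:Axiom>
    <owl:annotatedSource rdf:resource="http://purl.obolibrary.org/obo/CHEBI_4508"/>
    <owl:annotatedProperty rdf:resource="http://www.geneontology.org/formats/oboInOwl#hasDbXref"/>
    <owl:annotatedTarget rdf:datatype="http://www.w3.org/2001/XMLSchema#string">CAS:15307-81-0</owl:annotatedTarget>
    <oboInOwl:source rdf:datatype="http://www.w3.org/2001/XMLSchema#string">ChemIDplus</oboInOwl:source>
  </owl:Axiom>
</rdf:RDF>"""

root = ET.fromstring(SAMPLE)
result = xref_sources(root, "http://purl.obolibrary.org/obo/CHEBI_4508")
print(result)  # {'CAS:15307-81-0': 'ChemIDplus'}
```

To run this on a real file, `ET.parse(path).getroot()` could replace `ET.fromstring(SAMPLE)`. The full ChEBI OWL file is large, so a streaming parser such as `ET.iterparse` may be a better fit there. The resulting mapping can then be joined with the xref identifiers that pronto already returns.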