sameAs links between currency-USD and currency-USDollar

Open VladimirAlexiev opened this issue 2 years ago • 10 comments

(was: fibo-fnd-acc-4217:ISO4217-CodeSet individual URLs should use codes not names)

2022Q2 file FND/Accounting/ISO4217-CurrencyCodes.rdf has this:

fibo-fnd-acc-4217:USD
        rdf:type           fibo-fnd-acc-cur:CurrencyIdentifier , owl:NamedIndividual ;
        lcc-lr:hasTag      "USD" ;
        rdfs:label         "USD" ;
        lcc-lr:denotes     fibo-fnd-acc-4217:USDollar ;
        lcc-lr:identifies  fibo-fnd-acc-4217:USDollar .

fibo-fnd-acc-4217:USDollar
        rdf:type                       fibo-fnd-acc-cur:Currency , owl:NamedIndividual ;
        lcc-lr:hasName                 "US Dollar" ;
        rdfs:label                     "US Dollar" ;
        fibo-fnd-acc-cur:hasNumericCode "840" ;
        lcc-cr:isUsedBy                lcc-3166-1:VirginIslandsBritish ...
  • Use Case:
    • I am converting a dataset that uses currency codes (eg Crunchbase #1808) to RDF
    • I want to reuse the above individuals, rather than making my own
  • Problem: I cannot compute (predict) the name USDollar from the code USD
    • fibo-fnd-acc-4217:USD doesn't have that problem, but financial data relates to Currency not CurrencyIdentifier
    • Most RDFization tools cannot do an RDF lookup while converting tabular data to RDF (eg RML, R2RML, TARQL...)
    • (In contrast, Ontotext Refine can do an RDF lookup, see https://github.com/VladimirAlexiev/rdf2rml/blob/master/doc/rdf2sparql.pod#preprocessor-macros CB_AGENT_URL() and then https://github.com/VladimirAlexiev/rdf2rml/blob/master/doc/rdf2sparql.pod#generated-sparql)
  • Suggestion: change the above URLs to fibo-fnd-acc-4217:USD-code and fibo-fnd-acc-4217:USD-currency respectively
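
To make the problem concrete, here is a minimal Python sketch of what a lookup-free mapping tool can do: build a URL from a column value by string concatenation and nothing more. The namespace IRI, the sample row, and the helper `mint_iri` are assumptions for illustration (check the namespace against the FIBO release you use); the `-currency` suffix follows the suggestion above.

```python
# Assumed expansion of the fibo-fnd-acc-4217: prefix (illustrative only).
NS_4217 = "https://spec.edmcouncil.org/fibo/ontology/FND/Accounting/ISO4217-CurrencyCodes/"


def mint_iri(code: str, suffix: str = "") -> str:
    """Build an individual IRI from a currency code by plain concatenation (hypothetical helper)."""
    return NS_4217 + code.strip().upper() + suffix


# Hypothetical input row from a tabular dataset.
row = {"company": "ExampleCo", "raised_currency_code": "usd"}

# The CurrencyIdentifier IRI is predictable from the code alone.
identifier_iri = mint_iri(row["raised_currency_code"])

# The suggested Currency IRI would be equally predictable.
suggested_currency_iri = mint_iri(row["raised_currency_code"], "-currency")

# The current Currency IRI (ending in USDollar) cannot be derived from "USD"
# this way; it needs a lookup against the FIBO data.
```

Tools such as RML or TARQL work essentially like `mint_iri`: a template over column values, which is why a name-based local part is out of reach for them.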

VladimirAlexiev avatar Aug 17 '22 08:08 VladimirAlexiev

@VladimirAlexiev You should be able to use the inverse query to come up with the name of the currency from the code. The code identifies the currency. When we integrate the OMG Commons ontology library, identifies will be a subproperty of denotes and the redundancy between the two properties will be eliminated. If you use "isIdentifiedBy" in an inverse SPARQL query, though, it all works properly.
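
As a rough sketch of that lookup outside a SPARQL engine, assuming the individuals have been loaded as (subject, predicate, object) tuples of plain prefixed-name strings; `currency_for_code` is a hypothetical helper, not part of FIBO or any library:

```python
# Triples taken from the snippet in the issue description.
triples = [
    ("fibo-fnd-acc-4217:USD", "lcc-lr:hasTag", "USD"),
    ("fibo-fnd-acc-4217:USD", "lcc-lr:identifies", "fibo-fnd-acc-4217:USDollar"),
]


def currency_for_code(triples, code):
    """Follow hasTag to the identifier, then identifies to the currency."""
    identifiers = {s for s, p, o in triples if p == "lcc-lr:hasTag" and o == code}
    for s, p, o in triples:
        if p == "lcc-lr:identifies" and s in identifiers:
            return o
    return None  # no currency is identified by this code


usd_currency = currency_for_code(triples, "USD")
```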

We have had other users at banks request the names of the currencies as we have them. We have used a pattern for the MIC codes where we have Exchange-code and MIC-code mainly because of (1) the volume of individuals and (2) the difficulty in differentiating the names using algorithms (many are duplicated in the data from ISO). But our policy is not to use codes as you suggest when the number of individuals is small and when the names are well-known and requested by members. The policy stems from the use of the ontology for vocabulary and business glossary purposes in addition to IT purposes, if that helps.

ElisaKendall avatar Aug 17 '22 14:08 ElisaKendall

Of course I can do it with a query, but most ETL tools cannot do a query (or even simple RDF pattern) while performing ETL.

If banks have requested URLs that use English phrases, that's a valid reason to keep it as it is. However, I think that a vocabulary or business glossary should use (and primarily display) rdfs:label, not the URL.

VladimirAlexiev avatar Aug 18 '22 13:08 VladimirAlexiev

One option would be to add an optional ontology that uses a pattern for what we have called "adjunct" URLs for country codes for a similar purpose, with owl:sameAs to the primary one that uses the name. The adjunct ontology would not start with the code, however; it would have Currency- and CurrencyCode- as the elements, so, for example, Currency-USD would have an owl:sameAs USDollar, and CurrencyCode-USD would have an owl:sameAs USD ... would something like that address your requirement?

ElisaKendall avatar Aug 18 '22 17:08 ElisaKendall

Currency-USD and CurrencyCode-USD are great URLs.

owl:sameAs works perfectly in GraphDB due to https://graphdb.ontotext.com/documentation/10.0/sameas-optimisation.html. Eg we used it in a recent semantization of NIH grants (cc @nataschake, @stefangn98; continuing example from https://github.com/w3c/sparql-12/issues/14) where a short and a long code is used for certain kinds of organizations:

<grant/123> a Grant; 
  funding <grant/123/funding/NIH>, <grant/123/funding/CDC>;
  administeredBy <funder/CD>.

<grant/123/funding/NIH> a Funding; funder <funder/NIH>; amount 10000.
<grant/123/funding/CDC> a Funding; funder <funder/CDC>; amount 12000.

<funder/CDC> a Funder;
  idAdministrator "CD"; # only "administeredBy" entities have this short code
  idFunder "CDC";       # all funders have this long code
  name "Centers for Disease Control".

<funder/CD> owl:sameAs <funder/CDC>.

However, owl:sameAs doesn't work so smoothly in other repositories. So if you include it, please use a separate file.
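
For repositories without such an optimisation, one possible workaround is to rewrite every member of an owl:sameAs cluster to a single representative IRI at load time. The following is a minimal Python sketch of that idea using union-find on the funder example above. It is an illustration under simplifying assumptions (IRIs and literals are plain strings, names like `canonical_map` are hypothetical), not a description of how GraphDB implements its optimisation.

```python
def canonical_map(same_as_pairs):
    """Map every IRI in an owl:sameAs cluster to one representative (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the lexicographically smaller IRI so the result is deterministic.
            keep, drop = sorted((ra, rb))
            parent[drop] = keep
    return {x: find(x) for x in list(parent)}


same_as = [("funder/CD", "funder/CDC")]
triples = [
    ("grant/123", "administeredBy", "funder/CD"),
    ("funder/CDC", "name", "Centers for Disease Control"),
]

canon = canonical_map(same_as)
# Rewrite subjects and objects; strings outside any cluster pass through unchanged.
merged = [(canon.get(s, s), p, canon.get(o, o)) for s, p, o in triples]
```

After the rewrite, the administrator of grant/123 and the node carrying the name are the same IRI, so a query needs no sameAs reasoning to join them.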

VladimirAlexiev avatar Aug 19 '22 07:08 VladimirAlexiev

ISO3166-1-CountryCodes-Adjunct uses sameAs (as pointed out in https://github.com/ga-group/iso10383/issues/6), so @mereolog can you do the same for FIBO currencies?

  • change the URLs to look like @ElisaKendall suggested: fibo-fnd-acc-4217:CurrencyCode-USD, fibo-fnd-acc-4217:Currency-USDollar
  • run this to generate the separate coreferencing file, which will include
    • eg fibo-fnd-acc-4217:Currency-USDollar owl:sameAs fibo-fnd-acc-4217:Currency-USD
construct {
  ?curr owl:sameAs ?currAsCode
} where {
  ?code a fibo-fnd-acc-cur:CurrencyIdentifier; lcc-lr:hasTag ?c; lcc-lr:identifies ?curr.
  bind(iri(concat(str(fibo-fnd-acc-4217:),"Currency-",?c)) as ?currAsCode)
}
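
For readers who want to generate the coreferencing triples without a SPARQL engine, here is a rough Python sketch that mirrors the query above. It assumes the data is available as (subject, predicate, object) tuples of plain prefixed-name strings; `adjunct_same_as` is a hypothetical helper, and the sample triples come from the snippet in the issue description.

```python
def adjunct_same_as(triples):
    """Yield (currency, owl:sameAs, Currency-<code>) for each CurrencyIdentifier."""
    identifiers = {
        s for s, p, o in triples
        if p == "rdf:type" and o == "fibo-fnd-acc-cur:CurrencyIdentifier"
    }
    # Tag (the ISO 4217 code) of each identifier individual.
    tags = {s: o for s, p, o in triples if p == "lcc-lr:hasTag" and s in identifiers}
    for s, p, o in triples:
        if p == "lcc-lr:identifies" and s in tags:
            yield (o, "owl:sameAs", "fibo-fnd-acc-4217:Currency-" + tags[s])


triples = [
    ("fibo-fnd-acc-4217:USD", "rdf:type", "fibo-fnd-acc-cur:CurrencyIdentifier"),
    ("fibo-fnd-acc-4217:USD", "lcc-lr:hasTag", "USD"),
    ("fibo-fnd-acc-4217:USD", "lcc-lr:identifies", "fibo-fnd-acc-4217:USDollar"),
]
result = list(adjunct_same_as(triples))
```

Like the query, the sketch uses whatever IRI the currency has in the input as the subject, so it works both before and after any renaming of the existing individuals.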

VladimirAlexiev avatar Aug 24 '22 11:08 VladimirAlexiev

@VladimirAlexiev @mereolog We will add a separate ontology that includes the adjuncts as we did for LCC, hopefully soon. The existing URLs will remain, since a number of banks are already using them, but a second ontology with these other codes will be added that includes the owl:sameAs references. I'll have to talk with Pawel (@mereolog) about the timing for scripting this and adding the ontology, but hopefully we can do this relatively soon.

ElisaKendall avatar Aug 26 '22 23:08 ElisaKendall

@ElisaKendall can we close this?

mereolog avatar Mar 25 '24 14:03 mereolog

@mereolog - I would rather leave it open for now. We haven't done the work, but that doesn't mean that we shouldn't. It has not been an issue for our other FIBO users, but some folks who are more software oriented and want to use currency codes for other applications might want something similar.

ElisaKendall avatar Mar 26 '24 15:03 ElisaKendall

For reference, I wrote this

@TechReport{Alexiev-Crunchbase-Fibo-2023,
  author       = {Vladimir Alexiev},
  title        = {{Exploring FIBO Complexity With Crunchbase: Representing Crunchbase IPOs in FIBO}},
  month        = sep,
  year         = 2023,
  url          = {https://rawgit2.com/VladimirAlexiev/crunchbase-fibo/main/README.html},
  url_Github   = {https://github.com/VladimirAlexiev/crunchbase-fibo/},
  keywords     = {fintech, Crunchbase, ontologies, semantic modeling, Initial Public Offering, IPO, Financial Industry Business Ontology, FIBO},
  abstract     = {The Financial Industry Business Ontology (FIBO) by the Enterprise Data Management Council (EDMC) is a family of ontologies and a reference model for representing data in the financial world using semantic technologies. It is used in fintech Knowledge Graph (KG) projects because it offers a comprehensive and principled approach to representing financial data, and a wide set of predefined models that can be used to implement data harmonization and financial data integration. The 2022Q2 FIBO release consists of 290 ontologies using 380 prefixes that cover topics such as legal entities, contracts, agency, trusts, regulators, securities, loans, derivatives, etc. FIBO's reach and flexible ontological approach allow the integration of a wide variety of financial data, but it comes at the price of more complex representation. Crunchbase (CB) is a well-known dataset by TechCrunch that includes companies, key people, funding rounds, acquisitions, Initial Public Offerings (IPOs), etc. It has about 2M companies with a good mix of established enterprises (including 47k public companies), mid-range companies and startups. We (Ontotext and other Wikidata contributors) have matched 72k CB companies to Wikidata, see this query. I explore the representation of Crunchbase data (more specifically IPOs) in FIBO and compare it to the simplest possible semantic representation. I therefore illustrate the complexity of FIBO, and explain its flexibility along the way. I finish with some discussion and conclusions as to when FIBO can bring value to fintech KG projects.},
}

Section https://rawgit2.com/VladimirAlexiev/crunchbase-fibo/main/README.html#currencies discusses this problem.

VladimirAlexiev avatar Apr 02 '24 08:04 VladimirAlexiev

@VladimirAlexiev This is a really great example. Any chance we can update it to use the latest revisions to FIBO, including migration of some ontologies to the OMG's Commons Ontology Library 1.1 standard? I would be happy to walk through that with you :).

ElisaKendall avatar Apr 04 '24 18:04 ElisaKendall