fibo
sameAs links between currency-USD and currency-USDollar
(was: fibo-fnd-acc-4217:ISO4217-CodeSet individual URLs should use codes not names)
2022Q2 file FND/Accounting/ISO4217-CurrencyCodes.rdf has this:
fibo-fnd-acc-4217:USD
  rdf:type fibo-fnd-acc-cur:CurrencyIdentifier , owl:NamedIndividual ;
  lcc-lr:hasTag "USD" ;
  rdfs:label "USD" ;
  lcc-lr:denotes fibo-fnd-acc-4217:USDollar ;
  lcc-lr:identifies fibo-fnd-acc-4217:USDollar .
fibo-fnd-acc-4217:USDollar
  rdf:type fibo-fnd-acc-cur:Currency , owl:NamedIndividual ;
  lcc-lr:hasName "US Dollar" ;
  rdfs:label "US Dollar" ;
  fibo-fnd-acc-cur:hasNumericCode "840" ;
  lcc-cr:isUsedBy lcc-3166-1:VirginIslandsBritish ...
- Use Case:
- I am converting a dataset that uses currency codes (eg Crunchbase #1808) to RDF
- I want to reuse the above individuals, rather than making my own
- Problem: I cannot compute (predict) the name USDollar from the code USD
- fibo-fnd-acc-4217:USD doesn't have that problem, but financial data relates to Currency, not CurrencyIdentifier
- Most RDFization tools cannot do an RDF lookup while converting tabular data to RDF (eg RML, R2RML, TARQL...)
- (In contrast, Ontotext Refine can do an RDF lookup, see https://github.com/VladimirAlexiev/rdf2rml/blob/master/doc/rdf2sparql.pod#preprocessor-macros (CB_AGENT_URL()) and then https://github.com/VladimirAlexiev/rdf2rml/blob/master/doc/rdf2sparql.pod#generated-sparql)
- Suggestion: change the above URLs to fibo-fnd-acc-4217:USD-code and fibo-fnd-acc-4217:USD-currency respectively
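To make the problem concrete, here is a minimal Python sketch (not part of FIBO or any ETL tool) contrasting the three ways of minting an IRI from a code. The namespace IRI, the function names and the one-entry lookup table are assumptions for illustration; only the USD/USDollar pair comes from the snippet above.

```python
# Assumed expansion of the fibo-fnd-acc-4217: prefix (not stated in this thread).
NS_4217 = "https://spec.edmcouncil.org/fibo/ontology/FND/Accounting/ISO4217-CurrencyCodes/"


def identifier_iri(code):
    """CurrencyIdentifier IRI: computable from the code alone."""
    return NS_4217 + code


def currency_iri_current(code, names):
    """Currency IRI under the current naming: needs a code-to-name lookup.

    `names` is a hypothetical table such as {"USD": "USDollar"}; a template-only
    mapping has no such table available, which is the problem described above.
    """
    return NS_4217 + names[code]  # raises KeyError for an unknown code


def currency_iri_suggested(code):
    """Currency IRI under the suggested naming (USD-currency): no lookup."""
    return NS_4217 + code + "-currency"


if __name__ == "__main__":
    names = {"USD": "USDollar"}  # the only pair quoted in this thread
    print(identifier_iri("USD"))
    print(currency_iri_current("USD", names))
    print(currency_iri_suggested("USD"))
```

The first and third functions are plain string templates, which is all that most tabular-to-RDF mappings offer; the second needs data that lives only in the ontology file.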
@VladimirAlexiev You should be able to use the inverse query to come up with the name of the currency from the code. The code identifies the currency. When we integrate the OMG Commons ontology library, identifies will be a subproperty of denotes and the redundancy between the two properties will be eliminated. If you use "isIdentifiedBy" in an inverse SPARQL query, though, it all works properly.
We have had other users at banks request the names of the currencies as we have them. We have used a pattern for the MIC codes where we have Exchange-code and MIC-code mainly because of (1) the volume of individuals and (2) the difficulty in differentiating the names using algorithms (many are duplicated in the data from ISO). But our policy is not to use codes as you suggest when the number of individuals is small and when the names are well-known and requested by members. The policy stems from the use of the ontology for vocabulary and business glossary purposes in addition to for IT purposes, if that helps.
Of course I can do it with a query, but most ETL tools cannot do a query (or even simple RDF pattern) while performing ETL.
If banks have requested URLs that use English phrases, that's a valid reason to keep it as it is. However, I think that a vocabulary or business glossary should use (and primarily display) rdfs:label, not the URL.
One option would be to add an optional ontology that follows the pattern of what we have called "adjunct" URLs, which we used for country codes for a similar purpose, with OWL sameAs to the primary one that uses the name. The adjunct ontology would not start with the code, however; it would have Currency-
Currency-USD and CurrencyCode-USD are great URLs.
owl:sameAs works perfectly in GraphDB due to https://graphdb.ontotext.com/documentation/10.0/sameas-optimisation.html.
Eg we used it in a recent semantization of NIH grants (cc @nataschake, @stefangn98; continuing example from https://github.com/w3c/sparql-12/issues/14) where a short and a long code is used for certain kinds of organizations:
<grant/123> a Grant;
funding <grant/123/funding/NIH>, <grant/123/funding/CDC>;
administeredBy <funder/CD>.
<grant/123/funding/NIH> a Funding; funder <funder/NIH>; amount 10000.
<grant/123/funding/CDC> a Funding; funder <funder/CDC>; amount 12000.
<funder/CDC> a Funder;
idAdministrator "CD"; # only "administeredBy" entities have this short code
idFunder "CDC"; # all funders have this long code
name "Centers for Disease Control".
<funder/CD> owl:sameAs <funder/CDC>.
However, owl:sameAs doesn't work so smoothly in other repositories.
So if you include it, please use a separate file.
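As a rough illustration of what a repository without sameAs support would have to do, the sketch below collapses owl:sameAs equivalence classes with a small union-find and rewrites the remaining triples to one representative per class. This is a toy model of the idea, not GraphDB's implementation; the function names and the string encoding of IRIs and literals are assumptions, and the data is a subset of the NIH example above.

```python
def find(parent, node):
    """Return the representative of node's equivalence class (path halving)."""
    parent.setdefault(node, node)
    while parent[node] != node:
        parent[node] = parent[parent[node]]
        node = parent[node]
    return node


def merge_same_as(triples):
    """Rewrite triples so every member of a sameAs class uses one representative."""
    parent = {}
    for s, p, o in triples:
        if p == "owl:sameAs":
            root_s, root_o = find(parent, s), find(parent, o)
            if root_s != root_o:
                parent[root_s] = root_o  # union the two classes

    def canon(term):
        # Terms that never appeared in a sameAs statement are left untouched.
        return find(parent, term) if term in parent else term

    return {(canon(s), p, canon(o)) for s, p, o in triples if p != "owl:sameAs"}


if __name__ == "__main__":
    triples = [
        ("<grant/123>", "administeredBy", "<funder/CD>"),
        ("<funder/CDC>", "name", '"Centers for Disease Control"'),
        ("<funder/CD>", "owl:sameAs", "<funder/CDC>"),
    ]
    for triple in sorted(merge_same_as(triples)):
        print(triple)
```

After merging, the grant points at the same node that carries the name, so a lookup through either funder URL finds the same data.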
ISO3166-1-CountryCodes-Adjunct uses sameAs (pointed out in https://github.com/ga-group/iso10383/issues/6), so @mereolog can you do the same for FIBO currencies?
- change the URLs to look like @ElisaKendall suggested: fibo-fnd-acc-4217:CurrencyCode-USD, fibo-fnd-acc-4217:Currency-USDollar
- run this query to generate the separate coreferencing file, which will include eg fibo-fnd-acc-4217:Currency-USDollar owl:sameAs fibo-fnd-acc-4217:Currency-USD
construct {
?curr owl:sameAs ?currAsCode
} where {
?code a fibo-fnd-acc-cur:CurrencyIdentifier; lcc-lr:hasTag ?c; lcc-lr:identifies ?curr.
bind(iri(concat(str(fibo-fnd-acc-4217:),"Currency-",?c)) as ?currAsCode)
}
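For anyone scripting this outside a triplestore, the CONSTRUCT query above can be mirrored in a few lines of Python. This is only a sketch: the function name is hypothetical, and the input rows are assumed to be (tag, currency) pairs already extracted from the ontology, corresponding to the ?c and ?curr bindings of the query.

```python
def coreference_lines(rows, prefix="fibo-fnd-acc-4217:"):
    """Yield one Turtle owl:sameAs statement per (tag, currency) pair.

    `tag` plays the role of ?c (the lcc-lr:hasTag value) and `currency` the
    role of ?curr (the prefixed name of the identified Currency individual).
    """
    for tag, currency in rows:
        yield f"{currency} owl:sameAs {prefix}Currency-{tag} ."


if __name__ == "__main__":
    # Only the USD pair appears in this thread; the renamed URL is the proposed one.
    rows = [("USD", "fibo-fnd-acc-4217:Currency-USDollar")]
    for line in coreference_lines(rows):
        print(line)
```

Writing these statements to their own file keeps the coreferences optional, in line with the request to separate them from the main ontology.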
@VladimirAlexiev @mereolog We will add a separate ontology that includes the adjuncts as we did for LCC, hopefully soon. The existing URLs will remain, since a number of banks are already using them, but a second ontology with these other codes will be added that includes the owl:sameAs references. I'll have to talk with Pawel (@mereolog) about the timing for scripting this and adding the ontology, but hopefully we can do this relatively soon.
@ElisaKendall can we close this?
@mereolog - I would rather leave it open for now. We haven't done the work, but that doesn't mean that we shouldn't. It has not been an issue for our other FIBO users, but some more software-oriented folks who want to use currency codes for other applications might want something similar.
For reference, I wrote this
@TechReport{Alexiev-Crunchbase-Fibo-2023,
author = {Vladimir Alexiev},
title = {{Exploring FIBO Complexity With Crunchbase: Representing Crunchbase IPOs in FIBO}},
month = sep,
year = 2023,
url = {https://rawgit2.com/VladimirAlexiev/crunchbase-fibo/main/README.html},
url_Github = {https://github.com/VladimirAlexiev/crunchbase-fibo/},
keywords = {fintech, Crunchbase, ontologies, semantic modeling, Initial Public Offering, IPO, Financial Industry Business Ontology, FIBO},
abstract = {The Financial Industry Business Ontology (FIBO) by the Enterprise Data Management Council (EDMC) is a family of ontologies and a reference model for representing data in the financial world using semantic technologies. It is used in fintech Knowledge Graph (KG) projects because it offers a comprehensive and principled approach to representing financial data, and a wide set of predefined models that can be used to implement data harmonization and financial data integration. The 2022Q2 FIBO release consists of 290 ontologies using 380 prefixes that cover topics such as legal entities, contracts, agency, trusts, regulators, securities, loans, derivatives, etc. FIBO's reach and flexible ontological approach allow the integration of a wide variety of financial data, but it comes at the price of more complex representation. Crunchbase (CB) is a well-known dataset by TechCrunch that includes companies, key people, funding rounds, acquisitions, Initial Public Offerings (IPOs), etc. It has about 2M companies with a good mix of established enterprises (including 47k public companies), mid-range companies and startups. We (Ontotext and other Wikidata contributors) have matched 72k CB companies to Wikidata, see this query. I explore the representation of Crunchbase data (more specifically IPOs) in FIBO and compare it to the simplest possible semantic representation. I therefore illustrate the complexity of FIBO, and explain its flexibility along the way. I finish with some discussion and conclusions as to when FIBO can bring value to fintech KG projects.},
}
Section https://rawgit2.com/VladimirAlexiev/crunchbase-fibo/main/README.html#currencies discusses this problem.
@VladimirAlexiev This is a really great example. Any chance we can update it to use the latest revisions to FIBO, including migration of some ontologies to the OMG's Commons Ontology Library 1.1 standard? I would be happy to walk through that with you :).