
Combine default graph and optional graphs via UNION to return all the types available in the dataset

Open kinow opened this issue 2 years ago • 9 comments

Reasons for creating this PR

Only the types of the default graph are returned in the REST API, unless the user enables the option in Jena that includes all graphs.

Link to relevant issue(s), if any

  • Closes #678

Description of the changes in this PR

In the parts of the query that reference types (such as the rdfs:label or a subclass statement), I have added a sibling UNION that wraps the same statement in GRAPH ?g { same_statement }, to simulate what Jena does with that setting.
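
A minimal sketch of that pattern, not the exact query in this branch: each statement is matched either in the default graph or in any named graph. The variable names ?g1 and ?g2 are illustrative assumptions; separate variables are used so that a type and its label may come from different graphs.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT DISTINCT ?type ?label
WHERE {
  # Subclass statement: default graph, or any named graph
  { ?type rdfs:subClassOf+ skos:Concept }
  UNION
  { GRAPH ?g1 { ?type rdfs:subClassOf+ skos:Concept } }

  # Label statement: the same sibling UNION, with its own graph variable
  OPTIONAL {
    { ?type rdfs:label ?label }
    UNION
    { GRAPH ?g2 { ?type rdfs:label ?label } }
  }
}
```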

With the changes in this branch, and after clearing the Local Storage in my browser, I can see all the types both in the REST response and in the UI (e.g. search auto-complete, see #1323), without needing to enable the option in Jena Fuseki that merges all graphs into the default graph.

Known problems or uncertainties in this PR

Does it need a test? Does it sound like a valid solution? I thought about doing a simpler { { query_as_is } UNION { GRAPH ?g { query_as_is } } }, but since the query wasn't very long I opted for the current approach.
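
For comparison, a sketch of that simpler form with a shortened stand-in for the real query body. Note one difference in behaviour: here a type and its label must be found together, either both in the default graph or both in the same named graph.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT DISTINCT ?type ?label
WHERE {
  # Branch 1: the query as-is, against the default graph
  {
    ?type rdfs:subClassOf+ skos:Concept .
    OPTIONAL { ?type rdfs:label ?label }
  }
  UNION
  # Branch 2: the same query, evaluated inside each named graph
  {
    GRAPH ?g {
      ?type rdfs:subClassOf+ skos:Concept .
      OPTIONAL { ?type rdfs:label ?label }
    }
  }
}
```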

Checklist

  • [ ] phpUnit tests pass locally with my changes
  • [ ] I have added tests that prove my fix is effective or that my feature works (if not, explain why below)
  • [ ] The PR doesn't introduce unintended code changes (e.g. empty lines or useless reindentation)

kinow avatar Jun 01 '22 22:06 kinow

Codecov Report

Merging #1327 (5d375f4) into master (5d193c2) will increase coverage by 0.00%. The diff coverage is 100.00%.

@@            Coverage Diff            @@
##             master    #1327   +/-   ##
=========================================
  Coverage     70.68%   70.68%           
- Complexity     1646     1647    +1     
=========================================
  Files            32       32           
  Lines          3786     3787    +1     
=========================================
+ Hits           2676     2677    +1     
  Misses         1110     1110           

Impacted Files                    Coverage Δ
model/Model.php                   80.81% <100.00%> (ø)
model/sparql/GenericSparql.php    91.28% <100.00%> (+0.01%) :arrow_up:

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 5d193c2...5d375f4.

codecov[bot] avatar Jun 01 '22 22:06 codecov[bot]

Thanks for giving this a shot @kinow , this has been a longstanding issue that has bitten many users (including yourself)!

However, I'm not sure this is the right approach: adding lots of UNION clauses to the query seems overcomplicated to me. The intent of the types query is simply to list all the concept types across all the vocabularies. The result should be similar, if not identical, to the result of performing a types query separately for each vocabulary and merging the results into one list.

There are at least a few ways of doing that that I can think of:

  1. Relying on the union default graph (current status quo). This is really a hack and should be avoided. For one, it relies on the Fuseki setting. Second, this isn't really accurate as the dataset may include other graphs that are not configured as Skosmos vocabularies but may still contain concept type definitions that will end up in the results.
  2. Adding a single GRAPH ?g { ... } block around the whole query. I think that this should be equivalent to the above, at least in sensible scenarios (each graph contains both the type definition and its label). It's still a bit problematic since types from graphs that are not configured as vocabularies may be included.
  3. Same as 2, but selecting only those graphs that are configured as vocabularies and using VALUES ?g to enumerate those, thus excluding irrelevant graphs.
  4. Same idea as 3, but using FROM NAMED clauses for selecting individual graphs instead of GRAPH ?g with VALUES ?g.
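
As a rough sketch of option 3, assuming two hypothetical vocabulary graphs (the example.org URIs are placeholders, not real Skosmos configuration) and a shortened version of the types query:

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT DISTINCT ?type ?label ?superclass
WHERE {
  # Only graphs configured as vocabularies (placeholder URIs)
  VALUES ?g { <http://example.org/vocab-a/> <http://example.org/vocab-b/> }
  GRAPH ?g {
    { BIND(skos:Concept AS ?type) }
    UNION
    { ?type rdfs:subClassOf+ skos:Concept }
    OPTIONAL { ?type rdfs:label ?label }
    OPTIONAL { ?type rdfs:subClassOf ?superclass }
    # Keep only types that are actually used by some labelled resource
    FILTER EXISTS { ?s a ?type ; skos:prefLabel ?prefLabel }
  }
}
```

Dropping the VALUES line gives option 2. Option 4 would keep the GRAPH ?g block but replace VALUES with one FROM NAMED clause per vocabulary graph.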

Would you like to try some of the approaches 2-4? I think any of them would be an improvement over the current situation, as long as the query still performs well.

osma avatar Jun 02 '22 09:06 osma

> Would you like to try some of the approaches 2-4? I think any of them would be an improvement over the current situation, as long as the query still performs well.

Sounds like fun! I will give it a try later, after or in between the other jQuery/Bootstrap work. It will be a good way to brush up on SPARQL and Jena Fuseki :)

Do you have any suggestion on which item to try first (or to start thinking about), between options 2, 3, and 4?

Thanks!

kinow avatar Jun 02 '22 10:06 kinow

I think 3 or 4 would be most preferable as they don't suffer from the problem of irrelevant data being picked up from other possible named graphs in the dataset.

osma avatar Jun 02 '22 10:06 osma

I decided to try FROM NAMED clauses since we already had the code for that. The existing code already called the function that builds those clauses and stored the result in the $fcl variable, but it did so without passing any vocabularies, so nothing was returned.

Here's an example of what the query may look like. It is from my development environment with a few vocabularies configured.

PREFIX  skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX  isothes: <http://purl.org/iso25964/skos-thes#>
PREFIX  rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX  finaf: <http://urn.fi/URN:NBN:fi:au:finaf:>

SELECT DISTINCT  ?type ?label ?superclass
FROM NAMED <http://skos.um.es/unescothes/>
FROM NAMED <http://zbw.eu/stw/>
FROM NAMED <http://vocabs.rossio.fcsh.unl.pt/tesauro/>
FROM NAMED <http://www.yso.fi/onto/yso/>
FROM NAMED finaf:
WHERE
  { GRAPH ?g
      { {   { BIND(skos:Concept AS ?type) }
          UNION
            { BIND(skos:Collection AS ?type) }
          UNION
            { BIND(isothes:ConceptGroup AS ?type) }
          UNION
            { BIND(isothes:ThesaurusArray AS ?type) }
          UNION
            { ?type rdfs:subClassOf/(rdfs:subClassOf)* skos:Concept }
          UNION
            { ?type rdfs:subClassOf/(rdfs:subClassOf)* skos:Collection }
        }
        OPTIONAL
          { ?type  rdfs:label  ?label
            FILTER langMatches(lang(?label), "en")
          }
        OPTIONAL
          { ?type  rdfs:subClassOf  ?superclass }
        FILTER EXISTS { ?s  a               ?type ;
                            skos:prefLabel  ?prefLabel
                      }
      }
  }


And the REST API response:

{
  "@context": {
    "skos": "http:\/\/www.w3.org\/2004\/02\/skos\/core#",
    "uri": "@id",
    "type": "@type",
    "rdfs": "http:\/\/www.w3.org\/2000\/01\/rdf-schema#",
    "onki": "http:\/\/schema.onki.fi\/onki#",
    "label": "rdfs:label",
    "superclass": {
      "@id": "rdfs:subClassOf",
      "@type": "@id"
    },
    "types": "onki:hasType",
    "@language": "en",
    "@base": "http:\/\/localhost:9090\/rest\/v1\/"
  },
  "uri": "",
  "types": [
    {
      "uri": "http:\/\/www.w3.org\/2004\/02\/skos\/core#Concept",
      "label": "Concept"
    },
    {
      "uri": "http:\/\/zbw.eu\/namespaces\/zbw-extensions\/Descriptor",
      "label": "Descriptor",
      "superclass": "http:\/\/www.w3.org\/2004\/02\/skos\/core#Concept"
    },
    {
      "uri": "http:\/\/zbw.eu\/namespaces\/zbw-extensions\/Thsys",
      "label": "Thsys",
      "superclass": "http:\/\/www.w3.org\/2004\/02\/skos\/core#Concept"
    },
    {
      "uri": "http:\/\/www.w3.org\/2004\/02\/skos\/core#Collection",
      "label": "Collection"
    },
    {
      "uri": "http:\/\/purl.org\/iso25964\/skos-thes#ConceptGroup",
      "label": "Concept group"
    },
    {
      "uri": "http:\/\/purl.org\/iso25964\/skos-thes#ThesaurusArray",
      "label": "Array of sibling concepts"
    },
    {
      "uri": "http:\/\/www.yso.fi\/onto\/yso-meta\/Concept",
      "label": "General concept",
      "superclass": "http:\/\/www.w3.org\/2004\/02\/skos\/core#Concept"
    },
    {
      "uri": "http:\/\/www.yso.fi\/onto\/yso-meta\/Individual",
      "label": "Individual concept",
      "superclass": "http:\/\/www.w3.org\/2004\/02\/skos\/core#Concept"
    },
    {
      "uri": "http:\/\/www.yso.fi\/onto\/yso-meta\/Hierarchy",
      "label": "Hierarchical concept",
      "superclass": "http:\/\/www.w3.org\/2004\/02\/skos\/core#Concept"
    }
  ]
}

kinow avatar Jun 05 '22 03:06 kinow

SonarCloud Quality Gate failed.

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A)
Security Hotspots: 1 (rating E)
Code Smells: 1 (rating A)

Coverage: no information
Duplication: 0.0%

sonarqubecloud[bot] avatar Jun 05 '22 04:06 sonarqubecloud[bot]

The SonarCloud security issue reported is for a prefix URL, so that shouldn't be a blocker, I think. Ready for review again :+1:

kinow avatar Jun 05 '22 04:06 kinow

I can try to finish this off now that @kinow you are busy with the qtip-to-CSS PR #1324 ...

osma avatar Sep 13 '22 12:09 osma

> I can try to finish this off now that @kinow you are busy with the qtip-to-CSS PR #1324 ...

That'd be great, please :D Thank you @osma!

kinow avatar Sep 13 '22 19:09 kinow