SPARQL.js

Add Option to Keep Query Prefixes

Open vitalis-wiens opened this issue 10 months ago • 9 comments

This PR extends the options parameters of the SparqlGenerator.

The newly introduced option parameter keepQueryPrefixes makes it possible to combine the prefixes used in the query body with the ones declared in the prefix definitions.

The use-case is the following:

Some database vendors let users pass hints to the database through prefixes. However, the currently available options (used prefixes or all prefixes) either drop these optimization prefixes or add every available prefix, which results in very large prefix definitions and makes debugging queries more challenging.

Related Issue: https://github.com/RubenVerborgh/SPARQL.js/issues/186

vitalis-wiens avatar Mar 07 '25 09:03 vitalis-wiens

Hi @RubenVerborgh, metaphacts contributed this fix and we would highly appreciate it if it were included in and distributed through a new sparql.js release. Do you have any plans or a schedule for the next release, or are there other dependencies? Just asking for planning purposes. Thanks

vitalis-wiens avatar Mar 10 '25 12:03 vitalis-wiens

Hi @RubenVerborgh

thanks for the quick response and the additional perspective. I can follow your reasoning of allowing the user to provide a list of "fixedPrefixes" that are always added to the query (independent of their usage inside the query). Whether this is added by overloading the allPrefixes setting or by introducing a new dedicated option fixedPrefixes is, for me, a matter of taste. For clarity from an interface perspective I personally would prefer the additional fixedPrefixes option.

Our original use-case / improvement request is however slightly different from "fixedPrefixes":

we want to configure the generator to explicitly keep all those prefixes in the query prologue that are defined by the user (i.e., in the original query string). The reason for that in our case is that some database vendors use specialized prefixes as a kind of query hint.

In addition (what we are already able to do now): we want to inject a fixed list of prefixes, but from those only keep the ones that are used in the query. Note that this is already technically possible now in sparqljs.

Considering this: what do you think of the following abstract implementation sketch at the configuration level?

GeneratorSettings {
    // whether to render all prefixes known to the context of the query (independent of the usage)
    allPrefixes: true | false

    // list of additional prefixes to always add (independent of usage)
    fixedPrefixes: List<Prefix> 
 
    // keep all prefixes of the query prologue as specified by the user in the original query string (independent of the usage)
    keepQueryPrefixes: true | false
}

@vitalis-wiens would be able to do an iteration on the PR if this proposal makes sense to you

aschwarte10 avatar Mar 11 '25 10:03 aschwarte10

> Considering this: what do you think of the following abstract implementation sketch on configuration level

I don't understand the interaction between the three options? Are they fully independent?

In particular, the generator has no notion of the original query string.

RubenVerborgh avatar Mar 11 '25 15:03 RubenVerborgh

Hi @RubenVerborgh I have prepared a second commit with the suggested implementation for a new options parameter fixedPrefixes

I will try to summarize the different use-cases, expected behavior and interactions between the options

  • With the current PR we would have three options to control the generation of the prefix block of the query string:
      1. allPrefixes: boolean
      2. keepQueryPrefixes: boolean
      3. fixedPrefixes: object (similar to the prefix list that is used in the Parser)

Possible combinations of options parameters

Definitions:

  • All Prefixes: prefixes defined by the query and the parser
  • Used Prefixes: subset of All Prefixes where only the used ones are written to the query string
  • Fixed Prefixes: subset of All Prefixes
  • Query Prefixes: prefixes defined in the query

| keepQueryPrefixes | fixedPrefixes | allPrefixes | Results |
|---|---|---|---|
| false | null | false | Used Prefixes |
| false | null | true | All Prefixes |
| false | listOfFixedPrefixes | false | Used Prefixes + Fixed Prefixes (without duplicates) |
| false | listOfFixedPrefixes | true | All Prefixes |
| true | null | false | Used Prefixes + Query Prefixes (without duplicates) |
| true | null | true | All Prefixes |
| true | listOfFixedPrefixes | false | Used Prefixes + Fixed Prefixes + Query Prefixes (without duplicates) |
| true | listOfFixedPrefixes | true | All Prefixes |
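As a rough illustration, the combination table above can be modelled as a small selection function. This is only a sketch: `selectPrefixes`, `allKnown`, `used`, and `queryDefined` are hypothetical names introduced here, and the code does not reflect how the PR or SPARQL.js actually implements the options.

```javascript
// Hypothetical helper, NOT part of SPARQL.js: models the combination table.
//   allKnown     - object with every prefix known from the query and the parser
//   used         - names of the prefixes that occur in the query body
//   queryDefined - names of the prefixes declared in the query prologue
function selectPrefixes(options, context) {
  const { allPrefixes = false, fixedPrefixes = null, keepQueryPrefixes = false } = options;
  const { allKnown, used, queryDefined } = context;
  const knownNames = Object.keys(allKnown);

  // allPrefixes wins over every other option
  if (allPrefixes) return knownNames;

  // A Set removes duplicates between the different sources
  const selected = new Set(used);
  if (fixedPrefixes) Object.keys(fixedPrefixes).forEach(name => selected.add(name));
  if (keepQueryPrefixes) queryDefined.forEach(name => selected.add(name));

  // Filtering the known names keeps their order and drops fixed prefixes
  // that are not part of All Prefixes (the subset assumption of the PR)
  return knownNames.filter(name => selected.has(name));
}

// Example data (assumed for illustration only)
const context = {
  allKnown: {
    rdf: 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
    foaf: 'http://xmlns.com/foaf/0.1/',
    hint: 'http://example.org/hint#',
    ex: 'http://example.org/',
  },
  used: ['foaf'],
  queryDefined: ['foaf', 'hint'],
};

console.log(selectPrefixes({ keepQueryPrefixes: true }, context));
```

The row the use case cares about is `keepQueryPrefixes: true` with no fixed prefixes and `allPrefixes: false`: the unused `hint` prefix survives because it was declared in the query, while `rdf` and `ex` are dropped.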

In our use case we would like to go for the version where we have used prefixes and the ones defined in the query.

For debugging and experimenting, these may be relevant pointers:

q = parser.parse(queryString) parses the query string

q.prefixes is the list of query-defined prefixes, but q also has a prototype to which the prefixes from the parser are added.
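The distinction between query-defined and parser-provided prefixes can be illustrated without the library. The sketch below uses a plain stand-in object and assumes that the parser-configured prefixes sit on the prototype chain of the prefixes object; that layout is an assumption based on the description above, not a confirmed detail of SPARQL.js.

```javascript
// Stand-in for the result of parser.parse(queryString); no SPARQL.js import.
// ASSUMPTION: parser prefixes are reachable through the prototype chain.
const parserPrefixes = {
  rdf: 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
  foaf: 'http://xmlns.com/foaf/0.1/',
};
const q = { prefixes: Object.create(parserPrefixes) };
q.prefixes.hint = 'http://example.org/hint#'; // as if declared in the query prologue

// Object.keys only sees own properties: the prefixes declared in the query
const queryPrefixNames = Object.keys(q.prefixes);

// for...in also walks the prototype chain: query prefixes plus parser prefixes
const allPrefixNames = [];
for (const name in q.prefixes) allPrefixNames.push(name);

console.log(queryPrefixNames, allPrefixNames);
```

Under this assumption, own-property enumeration is enough to recover the prefixes a user wrote in the query, which is the list the use case wants to keep.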

Note: In the current implementation I assume that the fixed prefixes are a subset of all prefixes. This means that if neither the parser configuration nor the query provides a prefix that is mentioned in fixedPrefixes, it will not be exported. If we want to support the scenario where the fixed prefix list does not intersect with All Prefixes, I could extend the PR once more :)

I hope this explanation is helpful for finding a path forward

On a side note: For the next release could you also consider the following PR https://github.com/RubenVerborgh/SPARQL.js/pull/184 ?

vitalis-wiens avatar Mar 11 '25 17:03 vitalis-wiens

@vitalis-wiens Thanks for the detailed explanation!

Regarding the table, unless I'm missing something, we have 8 possible input combinations, but only 4 possible output combinations (and I actually think that's 3, because used + fixed + query prefixes sounds like simply “all” to me). That's why I proposed the simpler configuration above.

| allPrefixes | Generated prefixes |
|---|---|
| false | The query.prefixes that occur in the query body |
| true | Everything from query.prefixes |
| Array/Object | The query.prefixes that occur in the query body OR are listed in the Array |
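A minimal model of this single-option proposal might look as follows. The function name `prefixesToGenerate` and its parameters are hypothetical, and the sketch only mirrors the table; it is not the generator's actual code.

```javascript
// Hypothetical model of the proposed allPrefixes semantics; NOT SPARQL.js code.
//   queryPrefixes - plain object mapping prefix names to IRIs (query.prefixes)
//   usedNames     - names of prefixes that occur in the query body
//   allPrefixes   - false | true | Array of names | Object keyed by name
function prefixesToGenerate(queryPrefixes, usedNames, allPrefixes) {
  const names = Object.keys(queryPrefixes);
  if (allPrefixes === true) return names; // everything from query.prefixes

  // Collect the explicitly listed names, if any were given
  let listed = [];
  if (Array.isArray(allPrefixes)) listed = allPrefixes;
  else if (allPrefixes && typeof allPrefixes === 'object') listed = Object.keys(allPrefixes);

  // Keep a prefix when it is used in the body OR explicitly listed
  const keep = new Set([...usedNames, ...listed]);
  return names.filter(name => keep.has(name));
}

// Example data (assumed for illustration only)
const queryPrefixes = {
  rdf: 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
  foaf: 'http://xmlns.com/foaf/0.1/',
  hint: 'http://example.org/hint#',
};

console.log(prefixesToGenerate(queryPrefixes, ['foaf'], ['hint']));
```

With this shape, keeping the prefixes of the original query is left to the caller: whoever builds the input for the generator decides which names go into the list.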

The misunderstanding might stem from the fact that all of your tests (and perhaps your use case) start from SparqlParser. But that's not the common case. You can manipulate the input to generator.stringify.

I think this simple extension of the allPrefixes flag allows you to do everything you need, without the complexity of 3 flags that are not orthogonal.

RubenVerborgh avatar Mar 11 '25 19:03 RubenVerborgh

Hi @RubenVerborgh, thanks for the explanation and the pointer about the use case.

I have now revised the PR:

  • it extends the allPrefixes option so that it also accepts an object that holds the prefixes
  • in the function baseAndPrefixes we now have an object fixedPrefixes which is extracted from the options

I have also revised the tests

  • the first test just provides a static prefix list to the generator options
  • the second test provides a dynamic prefix list which is extracted from the query after it has been parsed --> this shows how our use-case can be realized with this option

vitalis-wiens avatar Mar 12 '25 11:03 vitalis-wiens

Hi @RubenVerborgh , did you have a chance to review the latest iteration / refinement from Vitalis?

We would ideally want to integrate the new version of sparqljs with an improvement for this issue in our upcoming release, where we are aiming for end of March.

Any feedback is highly appreciated, as it allows us to align our plans.

aschwarte10 avatar Mar 17 '25 07:03 aschwarte10

Thanks; will get to this ASAP; I'm a volunteer on this and it's teaching semester.

RubenVerborgh avatar Mar 17 '25 11:03 RubenVerborgh

Hi @RubenVerborgh did you have a chance to revisit the change proposed in this PR? Is there any chance to see it made available in a new sparqljs release? Thanks!

aschwarte10 avatar May 28 '25 05:05 aschwarte10