sparqlwrapper
Design considerations towards 2.x
It's not the right time, but prompted by the discussion in issue #36, I'd like to share this idea/methodology with the whole community.
The current SPARQLWrapper 1.x is pretty old: it was originally designed in 2007 and has been maintained since then, with important evolution at different levels (SPARQL 1.1 Protocol, RDFLib 4.x, Python 3.x). So at some point we should start to think about a major version 2.x with a renewed API. And I'd like to follow a community-driven process that could satisfy everybody. That means not only the authors (@iherman, @dayures, @indeyets and myself) would push their ideas, but everybody who uses the library should feed into this process.
Currently there is no date for such a milestone. This is just the starting point...
I'll just use a moment to remind everyone what 2.x means in terms of semver: Major version is incremented "when you make incompatible API changes".
So, we can safely introduce new classes in 1.x releases as long as old APIs still work, but 2.x is a chance to throw away old cruft
Exactly, that's the point: design without worrying about backwards compatibility.
would be cool if we got rdflib and sparqlwrapper to properly use six instead of our own py2,3 compat hacks: RDFLib/rdflib#374
(Just wrote this comment for #51 but @wikier noted it's more appropriate here.)
I don't have deep insight into how SPARQLWrapper is being used in applications, but I wonder if mechanisms like these should be pluggable and interchangeable? By which I mean that SPARQLWrapper (and thus RDFLib) shouldn't bundle a lot of choices by default, but support opt-ins by using a pluggable API.
It could even provide sensible defaults for "advanced" behaviour by doing try-imports and declaring optional dependencies. (But avoid opinionated choices.)
Or perhaps better, work as a component in an application responsible for optimizing its HTTP handling (and choice of e.g. keep-alive, HTTP/2...).
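To make the pluggable idea more concrete, here is a minimal sketch of what an injectable transport with a try-imported optional dependency could look like. Everything in it (`SparqlClient`, `Transport`, `urllib_transport`, `requests_transport`) is a hypothetical name chosen for illustration and not part of SPARQLWrapper; it only assumes that a transport can be reduced to a callable taking a URL and headers.

```python
from typing import Callable, Dict, Optional
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Optional dependency resolved with a try-import. `requests` is only an
# example of an opt-in backend; the default path below does not need it.
try:
    import requests  # type: ignore
except ImportError:
    requests = None

# Assumption: a transport is any callable (url, headers) -> response bytes.
Transport = Callable[[str, Dict[str, str]], bytes]


def urllib_transport(url: str, headers: Dict[str, str]) -> bytes:
    """Default transport built on the standard library only."""
    with urlopen(Request(url, headers=headers)) as response:
        return response.read()


def requests_transport(url: str, headers: Dict[str, str]) -> bytes:
    """Opt-in transport, usable only when `requests` is installed."""
    if requests is None:
        raise RuntimeError("the optional 'requests' dependency is not installed")
    response = requests.get(url, headers=headers)
    response.raise_for_status()
    return response.content


class SparqlClient:
    """Hypothetical client that delegates all HTTP handling to a transport."""

    def __init__(self, endpoint: str, transport: Optional[Transport] = None):
        self.endpoint = endpoint
        # The application may inject its own transport (keep-alive, HTTP/2, ...).
        self.transport = transport or urllib_transport

    def query(self, sparql: str,
              accept: str = "application/sparql-results+json") -> bytes:
        # Simple GET with the query as a URL parameter.
        url = self.endpoint + "?" + urlencode({"query": sparql})
        return self.transport(url, {"Accept": accept})
```

With this shape, an application that wants to own its HTTP handling passes its own callable as `transport`, and the library itself stays free of opinionated choices.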
as brought up in other issues:
- base on requests lib (#51, #52)
- introduce optional dependencies for things like keepalive (#51, #63)
Though I haven't used the lib yet, I'd say that Python 2 support should be dropped. Also the stdlib's json module is fine today, no need to rely on simplejson.
Maybe an optional local cache would be of benefit for some applications. I find magic Python representations of triples an interesting thing to use and to implement, like surfrdf.
Addendum, as someone who might want to contribute, I'd say that tox and pytest are imperative. As someone who wants to read the docs, Sphinx generates better docs imo.
Idea: in order to properly check the result format, the requested QueryType (SPARQL Query Form) is needed. For instance, requesting N3 for a SELECT is unexpected, and the query would return XML instead.
https://github.com/RDFLib/sparqlwrapper/blob/b0465ddaa92ccc8104346bbcf0427d4be556d370/SPARQLWrapper/Wrapper.py#L841
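A rough sketch of that check, assuming the query form is already known: the expected result format is looked up per query form, and an unsupported request falls back to a default. The tables and the `expected_format` function are hypothetical and deliberately simplified; the real set of formats per query form is decided by the library and the endpoint, not by this mapping.

```python
# Illustrative mapping only (assumption): which result formats make sense
# for each SPARQL query form.
ALLOWED_FORMATS = {
    "SELECT": {"xml", "json", "csv", "tsv"},
    "ASK": {"xml", "json"},
    "CONSTRUCT": {"rdf+xml", "n3", "turtle", "json-ld"},
    "DESCRIBE": {"rdf+xml", "n3", "turtle", "json-ld"},
}

# Assumed fallback used when the requested format does not fit the form.
FALLBACK_FORMAT = {
    "SELECT": "xml",
    "ASK": "xml",
    "CONSTRUCT": "rdf+xml",
    "DESCRIBE": "rdf+xml",
}


def expected_format(query_form: str, requested_format: str) -> str:
    """Return the format the response should be compared against."""
    form = query_form.upper()
    if form not in ALLOWED_FORMATS:
        raise ValueError("unknown SPARQL query form: %r" % query_form)
    fmt = requested_format.lower()
    if fmt in ALLOWED_FORMATS[form]:
        return fmt
    # Unexpected combination, e.g. N3 requested for a SELECT.
    return FALLBACK_FORMAT[form]
```

For example, `expected_format("SELECT", "n3")` falls back to `"xml"`, while `expected_format("CONSTRUCT", "n3")` keeps `"n3"`.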
Consider moving to the requests module (instead of urllib)
Think about reshaping some constants like SPARQLWrapper.RDF (see #27)
Already migrated to sphinx and readthedocs in version 1.8.5
Is there any progress in moving towards a version 2.x?
Also, I'm unclear about the relation between sparqlwrapper and rdflib. There is a dependency in requirements.txt (https://github.com/RDFLib/sparqlwrapper/blob/master/requirements.txt), but I couldn't see it imported in any of the files in the SPARQLWrapper module (https://github.com/RDFLib/sparqlwrapper/tree/master/SPARQLWrapper).
Would it make sense to use the SPARQLConnector or SPARQLStore from rdflib?
(This is also related to https://github.com/RDFLib/rdflib/issues/1264)
I think there was some discussion or clarification about the relation of sparqlwrapper to rdflib somewhere, maybe @nicholascar knows more about that, but I couldn't find it.
The current state of affairs, with a 2.0.0 release out there that breaks things while issues and PRs remain open, is IMHO subpar, and I'd love to help improve the situation. Who are the current committers?