rdflib.js
Add option to stop making prefixes up
I think it should actually stop being the default. Reading a file and serializing it back isn't currently consistent: you can end up with something very different from what you started with, and the prefix generator is a bit arbitrary. It would be nice to be able to just turn it off.
You mean you want it to remember the prefixes it parses, and keep them as suggestions for the serializer? Or do you want to just not use prefixes at all? I think both would be useful. Remembering prefixes from the parse is valuable for many things, like making a small tweak to an existing file.
I was referring to this functionality: https://github.com/linkeddata/rdflib.js/blob/2a08cfe753f71289ef5a1c7c804a8e29a9711c04/src/serializer.js#L582 If it finds something that doesn't have a prefix, it currently creates a new one out of thin air. It would be nice to have finer-grained control over this, so it only uses the prefixes I have specified previously.
Remembering prefixes from the parse is very valuable; it would just be nice if it didn't create prefixes other than the ones the file already had.
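To make the request concrete, here is a minimal sketch of what such an option could look like. It is not rdflib.js's actual serializer code: the function names, the `makeUpPrefixes` option, and the `n0`, `n1`, ... naming scheme are all hypothetical, and the namespace split is a simple heuristic.

```javascript
// Hypothetical sketch, not rdflib.js's actual serializer code.
// `prefixes` maps prefix -> namespace URI, as declared by the user or the parsed file.

function splitNamespace(uri) {
  // Heuristic: the namespace ends at the last '#' or '/'.
  const cut = Math.max(uri.lastIndexOf('#'), uri.lastIndexOf('/'));
  if (cut < 0) return null;
  return { namespace: uri.slice(0, cut + 1), local: uri.slice(cut + 1) };
}

function abbreviate(uri, prefixes, options = {}) {
  const { makeUpPrefixes = true } = options; // hypothetical option name
  const parts = splitNamespace(uri);
  if (parts && /^[A-Za-z_][\w-]*$/.test(parts.local)) {
    // Prefer a prefix that is already known.
    for (const [prefix, namespace] of Object.entries(prefixes)) {
      if (namespace === parts.namespace) return `${prefix}:${parts.local}`;
    }
    if (makeUpPrefixes) {
      // Stand-in for the "out of thin air" behaviour: invent the next free nN.
      let i = 0;
      while (Object.prototype.hasOwnProperty.call(prefixes, `n${i}`)) i += 1;
      prefixes[`n${i}`] = parts.namespace;
      return `n${i}:${parts.local}`;
    }
  }
  return `<${uri}>`; // fall back to the full IRI
}

const known = { foaf: 'http://xmlns.com/foaf/0.1/' };
const strict = { makeUpPrefixes: false };
console.log(abbreviate('http://xmlns.com/foaf/0.1/name', known, strict)); // foaf:name
console.log(abbreviate('http://example.org/vocab#thing', known, strict)); // <http://example.org/vocab#thing>
console.log(abbreviate('http://example.org/vocab#thing', known));         // n0:thing
```

With the option off, unknown namespaces are written as full IRIs, so the output never contains a prefix the caller did not supply.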
This issue https://github.com/rdfjs/representation-task-force/issues/116 links to a fork of rdf-data-model which adds a prefix map to DataFactory. It also links to other issues in the RDFJS draft which discuss handling prefixes.
Ah, I can somewhat see what the issue is, then: this is not standardized. This library already stores the prefixes inside the model, via setPrefixForURI, as a "hack".
My suggestion to add an option to stop making prefixes up is just so it works with the behavior the library already expresses.
When it reads an N3 document, it stores the prefixes in kb.namespaces, for example:
https://github.com/linkeddata/rdflib.js/blob/1460c90ae936be9ae62328f14010cf230ec3864d/src/n3parser.js#L490
And when it serializes, namespaces go into the serializer:
https://github.com/linkeddata/rdflib.js/blob/39df3ee297d4b365d195048817962b13d431bde6/src/serialize.js#L21
The makeUpPrefix functionality is not needed in this regard; passing the parsed prefixes through already works without it.
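The round trip described above can be illustrated without the library. This is only a sketch under stated assumptions: `collectPrefixes` and `prefixHeader` are hypothetical helpers, and a regex stands in for the real N3 parser, which handles far more syntax.

```javascript
// Illustration only: hypothetical helpers, not part of rdflib.js.

// Collect "@prefix p: <ns> ." declarations from Turtle/N3 text,
// playing the role of the namespaces remembered at parse time.
function collectPrefixes(text) {
  const namespaces = {};
  const pattern = /@prefix\s+([A-Za-z][\w-]*)?:\s*<([^>]*)>\s*\./g;
  for (const match of text.matchAll(pattern)) {
    namespaces[match[1] || ''] = match[2]; // '' is the default (empty) prefix
  }
  return namespaces;
}

// Emit declarations only for the namespaces that were remembered.
function prefixHeader(namespaces) {
  return Object.entries(namespaces)
    .map(([prefix, ns]) => `@prefix ${prefix}: <${ns}> .`)
    .join('\n');
}

const input = [
  '@prefix foaf: <http://xmlns.com/foaf/0.1/> .',
  '@prefix : <http://example.org/#> .',
  ':me a foaf:Person .',
].join('\n');

const remembered = collectPrefixes(input);
console.log(prefixHeader(remembered));
```

The header written back contains exactly the two declarations from the input and nothing invented.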
I would like to add that, while working on a simple WebID implementation, this feature has been really confusing. One would expect a given input to produce, after parsing, the same Turtle output, whereas actually the library makes up weird, unused prefixes.
I just figured I'd note how we've been doing things like this over in the PerlRDF community:
There's a module URI::NamespaceMap that holds the actual prefix-namespace mapping. It has various methods and types to help with that, and it makes it easy to get a suitable URI-typed object for various uses. You can add your mappings to that module manually. It also has a guess method, which doesn't do any guessing on its own, but looks the prefix up in one or more of four modules written by different authors, in this order:
- RDF::NS::Curated, that I maintain with just a carefully selected list of common prefix-namespace pairs, some of which are validated with tests.
- XML::CommonNS, which was written by the XML project back in the day and isn't really maintained anymore, but it works.
- RDF::NS, which is updated now and then with the highest voted mappings from prefix.cc. There's a lot of those.
- RDF::Prefixes, which guesses the prefix by looking at the URI and doing clever tricks. It does a fairly good job, actually, but is normally used only as a fallback.
The URI::NamespaceMap object is passed to the serializer, so the serializer doesn't do any guessing on its own; that all happens based on how the system is configured to use the above modules. This creates a very predictable prefix-namespace mapping, and you can have a lot of sensible mappings without doing much work yourself.
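For readers on the JavaScript side, the design can be sketched roughly as follows. This is loosely modelled on the description above and is not a port of URI::NamespaceMap: the class, its method names, and the sample lookup tables (including the `example.org` entry) are assumptions made for illustration.

```javascript
// Hypothetical sketch: a namespace map that owns all prefix decisions,
// so a serializer handed this object never invents prefixes itself.
class NamespaceMap {
  constructor(sources = []) {
    this.mappings = new Map(); // prefix -> namespace URI
    this.sources = sources;    // ordered lookup tables consulted by guess()
  }

  // Register a mapping manually.
  add(prefix, namespace) {
    this.mappings.set(prefix, namespace);
    return this;
  }

  // Look the prefix up in each source in order; the first hit wins.
  guess(prefix) {
    for (const source of this.sources) {
      if (Object.prototype.hasOwnProperty.call(source, prefix)) {
        this.mappings.set(prefix, source[prefix]);
        return source[prefix];
      }
    }
    return undefined; // unknown everywhere: nothing is made up
  }

  // Abbreviate using registered mappings only; the longest namespace wins.
  abbreviate(uri) {
    let best = null;
    for (const [prefix, ns] of this.mappings) {
      if (uri.startsWith(ns) && (!best || ns.length > best.ns.length)) {
        best = { prefix, ns };
      }
    }
    return best ? `${best.prefix}:${uri.slice(best.ns.length)}` : `<${uri}>`;
  }
}

// Sample tables standing in for a curated list and a larger community list.
const curated = { foaf: 'http://xmlns.com/foaf/0.1/' };
const community = {
  foaf: 'http://example.org/other-foaf#',
  dct: 'http://purl.org/dc/terms/',
};

const map = new NamespaceMap([curated, community]);
map.guess('foaf'); // resolved from the curated table, which comes first
map.guess('dct');  // falls through to the community table
console.log(map.abbreviate('http://xmlns.com/foaf/0.1/name'));  // foaf:name
console.log(map.abbreviate('http://purl.org/dc/terms/title'));  // dct:title
console.log(map.abbreviate('http://example.org/unknown#x'));    // <http://example.org/unknown#x>
```

Because the lookup order is fixed by configuration, the same input always yields the same prefixes, which is the predictability the thread is asking for.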
> I would like to add that, while working on a simple WebID implementation, this feature has been really confusing. One would expect a given input to produce, after parsing, the same Turtle output, whereas actually the library makes up weird, unused prefixes.
True, which is addressed in #251