rdflib.js

Add option to stop making prefixes up

Open fer22f opened this issue 5 years ago • 7 comments

I think it should actually stop being the default. Reading a file and serializing it back isn't currently consistent: you can end up with something very different from what you started with, and the prefix generator is a bit arbitrary. It would be nice to be able to just turn it off.

fer22f avatar Aug 02 '18 02:08 fer22f

You mean you want it to remember the prefixes it parses, and keep them as suggestions for the serializer? Or do you want to just not use prefixes at all? I think both would be useful. Remembering prefixes from the parse is valuable for many things, like making a small tweak to an existing file.

timbl avatar Aug 21 '18 14:08 timbl

I was referring to this functionality: https://github.com/linkeddata/rdflib.js/blob/2a08cfe753f71289ef5a1c7c804a8e29a9711c04/src/serializer.js#L582 If it finds something that doesn't have a prefix, it currently creates a new one out of thin air. It would be nice to have finer-grained control over this, so that it only uses the prefixes I have specified previously.

Remembering prefixes from the parse is very valuable; it would just be nice if it didn't create any beyond those the file already had.
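To make the request concrete, here is a minimal, self-contained sketch of what such a switch could look like. It is not rdflib.js's serializer code: the function names, the `allowInvented` option, and the Map shape (namespace URI to prefix) are all assumptions made for illustration.

```javascript
// Illustrative sketch only; this is not rdflib.js's serializer code.

// Split a URI at its last '#' or '/', a common namespace-boundary heuristic.
function splitUri(uri) {
  const cut = Math.max(uri.lastIndexOf('#'), uri.lastIndexOf('/'));
  if (cut < 0) return null;
  return { ns: uri.slice(0, cut + 1), local: uri.slice(cut + 1) };
}

// Pick the first "nN" prefix that is not already in use.
function inventPrefix(prefixes) {
  const used = new Set(prefixes.values());
  let i = 0;
  while (used.has(`n${i}`)) i += 1;
  return `n${i}`;
}

// `prefixes` is a Map from namespace URI to prefix (hypothetical shape).
// `allowInvented` is a hypothetical option; true mimics the "make one up" behaviour.
function abbreviate(uri, prefixes, { allowInvented = true } = {}) {
  const parts = splitUri(uri);
  // Only simple local names are abbreviated; anything else stays a full IRI.
  if (parts === null || !/^[A-Za-z_][A-Za-z0-9_-]*$/.test(parts.local)) {
    return `<${uri}>`;
  }
  let prefix = prefixes.get(parts.ns);
  if (prefix === undefined) {
    if (!allowInvented) return `<${uri}>`; // unknown namespace: keep the full IRI
    prefix = inventPrefix(prefixes);
    prefixes.set(parts.ns, prefix);
  }
  return `${prefix}:${parts.local}`;
}
```

With `allowInvented: false`, a term in an undeclared namespace is written as a full IRI and the prefix map is left untouched, so the output only ever contains prefixes that were supplied up front.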

fer22f avatar Aug 21 '18 14:08 fer22f

This issue https://github.com/rdfjs/representation-task-force/issues/116 links to a fork of rdf-data-model which adds a prefix map to DataFactory. It also links to other issues in the RDFJS draft which discuss handling prefixes.

elf-pavlik avatar Aug 21 '18 14:08 elf-pavlik

Ah, I can somewhat see what the issue is, then: this is not standardized. This library already stores the prefixes inside the model, via setPrefixForURI, as a "hack".

My suggestion to add an option to stop making prefixes up is just so it works with the behavior the library already expresses.

When it reads an N3 document, it stores the prefixes into kb.namespaces, for example:

https://github.com/linkeddata/rdflib.js/blob/1460c90ae936be9ae62328f14010cf230ec3864d/src/n3parser.js#L490

And when it serializes, namespaces go into the serializer:

https://github.com/linkeddata/rdflib.js/blob/39df3ee297d4b365d195048817962b13d431bde6/src/serialize.js#L21

The makeUpPrefix functionality is not needed in this regard, it already works.
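The round trip described above can be sketched without the library. This is a toy stand-in, not the n3parser or serializer source: the regex only handles simple `@prefix` lines, and the plain `store` object with a `namespaces` field (prefix to namespace URI) is an assumption modelled on the kb.namespaces mentioned above.

```javascript
// Toy sketch of "remember prefixes on parse, reuse them on serialize".
// Not rdflib.js code; `store` is a plain object standing in for the kb.

// Record every simple `@prefix p: <uri> .` declaration found in the text.
function collectPrefixes(turtle, store) {
  const re = /@prefix\s+([A-Za-z][\w-]*)?:\s*<([^>]*)>\s*\./g;
  for (const m of turtle.matchAll(re)) {
    store.namespaces[m[1] ?? ''] = m[2]; // '' is the default (empty) prefix
  }
  return store;
}

// Emit the prefix header from exactly what was recorded, nothing more.
function prefixHeader(store) {
  return Object.entries(store.namespaces)
    .map(([prefix, uri]) => `@prefix ${prefix}: <${uri}>.`)
    .join('\n');
}

const input = [
  '@prefix foaf: <http://xmlns.com/foaf/0.1/> .',
  '@prefix : <http://example.org/me#> .',
  ':i foaf:name "Alice" .',
].join('\n');

const store = collectPrefixes(input, { namespaces: {} });
console.log(prefixHeader(store));
```

Since the header is built only from the recorded map, a serializer working this way would emit the two declared prefixes and no others.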

fer22f avatar Aug 22 '18 14:08 fer22f

I would like to add that, while working on a simple WebID implementation, this feature has been really confusing. One would expect a given input to produce, after parsing, the same Turtle output, whereas actually the library makes up weird, unused prefixes.

rkaw92 avatar Dec 20 '18 22:12 rkaw92

I just figured I'll note how we've been doing things like this over in the PerlRDF community:

There's a module, URI::NamespaceMap, that holds the actual prefix-namespace mapping. It has various methods and types to help with that, and it makes it easy to get a suitable URI-typed object for various uses. You can add your mappings to that module manually. It also has a guess method, which doesn't do any guessing on its own, but looks things up in one or more of four modules written by different authors, in this order:

  • RDF::NS::Curated, that I maintain with just a carefully selected list of common prefix-namespace pairs, some of which are validated with tests.
  • XML::CommonNS, which was written by the XML project back in the day and isn't really maintained anymore, but it works.
  • RDF::NS, which is updated now and then with the highest voted mappings from prefix.cc. There's a lot of those.
  • RDF::Prefixes, which guesses the prefix by looking at the URI and doing clever tricks. It does a fairly good job, actually, but is normally used only as a fallback.

The URI::NamespaceMap object is passed to the serializer, so the serializer doesn't do any guessing on its own; that all happens based on how the system is configured to use the above modules. This creates a very predictable prefix-namespace mapping, and you can have a lot of sensible mappings without doing much work yourself.
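For readers on the JavaScript side, the same idea can be sketched as follows. This is only loosely modelled on the description above and is not the Perl API: the class, its method names, and the tiny lookup tables are hypothetical stand-ins for the real modules.

```javascript
// Sketch of a namespace map with an ordered chain of lookup sources.
// Loosely modelled on the description above; not URI::NamespaceMap's API.

// Turn a plain { prefix: uri } table into a lookup function.
const fromTable = (table) => {
  const entries = new Map(Object.entries(table));
  return (prefix) => entries.get(prefix); // undefined when unknown
};

class NamespaceMap {
  constructor(sources = []) {
    this.map = new Map();   // prefix -> namespace URI
    this.sources = sources; // consulted in order; first hit wins
  }

  add(prefix, uri) {
    this.map.set(prefix, uri);
    return this;
  }

  // Try to resolve a prefix through the sources; returns true on success.
  guess(prefix) {
    if (this.map.has(prefix)) return true;
    for (const lookup of this.sources) {
      const uri = lookup(prefix);
      if (uri !== undefined) {
        this.map.set(prefix, uri);
        return true;
      }
    }
    return false;
  }

  // Abbreviate with the longest matching namespace, else keep the full IRI.
  abbreviate(uri) {
    let best = null;
    for (const [prefix, ns] of this.map) {
      if (uri.startsWith(ns) && (best === null || ns.length > best.ns.length)) {
        best = { prefix, ns };
      }
    }
    return best === null ? `<${uri}>` : `${best.prefix}:${uri.slice(best.ns.length)}`;
  }
}

// Hypothetical stand-ins for a curated list and a broader, less vetted one.
const curated = fromTable({ foaf: 'http://xmlns.com/foaf/0.1/' });
const broad = fromTable({
  foaf: 'http://example.org/not-really-foaf/',
  dc: 'http://purl.org/dc/elements/1.1/',
});

const nsMap = new NamespaceMap([curated, broad]);
nsMap.guess('foaf'); // resolved by the curated source first
nsMap.guess('dc');   // falls through to the broader source
console.log(nsMap.abbreviate('http://xmlns.com/foaf/0.1/name'));
```

The serializer would then receive `nsMap` and only call `abbreviate`, so any guessing policy lives in how the sources are configured rather than in the serializer itself.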

kjetilk avatar Dec 20 '18 22:12 kjetilk

> I would like to add that, while working on a simple WebID implementation, this feature has been really confusing. One would expect a given input to produce, after parsing, the same Turtle output, whereas actually the library makes up weird, unused prefixes.

True, which is addressed in #251

ManuelTS avatar Jan 19 '22 13:01 ManuelTS