
Feature request: include configuration from URL

Open nichtich opened this issue 2 years ago • 6 comments

Description of the enhancement

Support reading the content of config.ttl from a SPARQL endpoint. This would require adding two new properties to config.ttl, such as skosmos:configEndpoint and skosmos:configGraph, so that additional triples are loaded from the referenced SPARQL endpoint.
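To illustrate the idea, here is a minimal sketch (in Python rather than Skosmos' PHP) of how the values of the two proposed properties could be turned into a SPARQL Protocol GET request that returns every triple of the configured graph. The function name, the example endpoint and the graph IRI are assumptions made for this sketch, not part of Skosmos.

```python
from urllib.parse import urlencode, urlsplit


def build_config_query_url(endpoint: str, graph: str) -> str:
    """Return a SPARQL Protocol GET URL that fetches all triples of one graph.

    `endpoint` and `graph` stand for the values of the proposed
    skosmos:configEndpoint and skosmos:configGraph properties (hypothetical).
    """
    # Reject characters that are not allowed inside <...> in a SPARQL query,
    # so the graph IRI cannot break out of the query text.
    if any(c in graph for c in '<>"{}|^`\\') or any(c.isspace() for c in graph):
        raise ValueError(f"not a safe graph IRI: {graph!r}")
    query = f"CONSTRUCT {{ ?s ?p ?o }} WHERE {{ GRAPH <{graph}> {{ ?s ?p ?o }} }}"
    # Append to an existing query string if the endpoint URL already has one.
    separator = "&" if urlsplit(endpoint).query else "?"
    return endpoint + separator + urlencode({"query": query})
```

The resulting URL could then be fetched with an Accept header such as text/turtle, and the response parsed and merged into the configuration model.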

Configuration is loaded in model/GlobalConfig.php and cached for performance reasons. I think the end of initializeConfig may be a good place to add loading of additional configuration via SPARQL. To avoid doing a SPARQL request each time, the result should be cached as well. I'm not sure about cache invalidation. Maybe only reload on a request with a special query parameter, e.g. ?config=reload.

Who are the users that would benefit from the enhancement and how?

Users who prefer to manage their configuration in a Fuseki database.

What new functionalities would the enhancement make possible?

In particular, this would allow managing vocabularies in one place (Fuseki).

Why is the enhancement important?

Adding a vocabulary requires both modifying config.ttl and importing data into Fuseki. This feature would allow managing both in one place and would ensure that the configuration is valid RDF (no invalid Turtle syntax).

nichtich avatar Feb 15 '23 07:02 nichtich

This is very similar to PR #1252 (loading configuration from a URL), which unfortunately was never finished. I think if we had support for just loading the configuration from a simple URL, it would be possible to use that facility to point to a graph in a Fuseki database (e.g. via the HTTP Graph Store protocol), right?

Caching and cache invalidation are tricky questions here, as they are in Computer Science generally.

osma avatar Feb 17 '23 09:02 osma

I assume that both loading from a URL and loading from a triple store are too costly to do without caching: even when the response is cached, it must be parsed. Currently the configuration is only reloaded when the modification time of config.ttl changes. How about an additional forced reload via a URL parameter? Or providing an easy way to flush the whole APC cache?

I'd keep config.ttl as the main source of configuration and optionally point to included sources (a URL or a graph stored in Fuseki) from there. This would also allow separating categories, vocabularies, and main configuration.

nichtich avatar Feb 17 '23 10:02 nichtich

The APC cache contains many things unrelated to configuration, in particular cached responses from external Linked Data sources. I don't think flushing the whole cache makes sense here, just the configuration part.

Forcing reloads with a URL parameter doesn't seem like a particularly RESTful way of doing things, and it could also be problematic if there is a reverse proxy such as nginx or Varnish in front of the Apache server running PHP and Skosmos. In addition, it could create conditions for a DoS attack, unless protected by some kind of access control.

If the main configuration is still in config.ttl as you suggest, would it be enough to use the same mechanism as currently, i.e. touch config.ttl (updating the timestamp of the file) would force a reload of all configuration, also the parts loaded from an external URL / endpoint? Of course this cannot be done from the web side directly. But it would be fairly easy for a sysadmin to set up a separate CGI or PHP script, outside Skosmos code, that simply performs touch config.ttl when accessed via URL.
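The timestamp-based mechanism described here can be sketched as follows. This is an illustrative Python model, not the actual Skosmos PHP code: the class name, the `load` callback (assumed to parse config.ttl together with any external parts) and the `bump_mtime` helper are all hypothetical.

```python
import os
from pathlib import Path
from typing import Callable, Optional


class ConfigCache:
    """Cache a parsed configuration until the main config file's mtime changes."""

    def __init__(self, path: Path, load: Callable[[Path], object]) -> None:
        self.path = path
        self.load = load  # hypothetical loader: parses the file and its external parts
        self._mtime: Optional[int] = None
        self._value: object = None

    def get(self) -> object:
        mtime = self.path.stat().st_mtime_ns
        if mtime != self._mtime:  # first call, or the file was touched or edited
            self._value = self.load(self.path)
            self._mtime = mtime
        return self._value


def bump_mtime(path: Path) -> None:
    """Stand-in for `touch config.ttl`: move the mtime forward by one second.

    A real `touch` sets the mtime to the current time; a fixed increment is
    used here only to keep the example deterministic.
    """
    stat = path.stat()
    os.utime(path, ns=(stat.st_atime_ns, stat.st_mtime_ns + 1_000_000_000))
```

Because only the timestamp of config.ttl is checked, changes in externally loaded parts stay invisible until the file is touched, which is exactly the trade-off discussed above.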

osma avatar Feb 17 '23 12:02 osma

You're right, I like the idea of triggering config reloading via touch config.ttl!

My current workaround is to collect configuration from multiple sources (core, categories and vocabularies) into config.ttl but I'd prefer to use a fixed config.ttl with core configuration and manage vocabularies (vocabulary data and their configuration) in Fuseki.

nichtich avatar Feb 20 '23 11:02 nichtich

> it would be possible to use that facility to point to a graph in a Fuseki database (e.g. via the HTTP Graph Store protocol), right?

Yes, I tried it out with Fuseki: a plain URL such as http://localhost:3030/skosmos/get?graph=default is enough, so basically this issue covers #1252. In contrast to the implementation there, I'd propose to extend the configuration with a property skosmos:includeConfig to load configuration from a URL or from another file. Recursive inclusion is not needed and, as discussed, changes in included configuration are not detected automatically; config.ttl must be touched to trigger reloading.

By the way, this would also help to use the same main configuration for multiple instances (dev, prod, test...) and include differing settings such as skosmos:baseHref and skosmos:logBrowserConsole via skosmos:includeConfig "config-local.ttl" from a local file.
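A rough sketch of the proposed include behaviour, written in Python for illustration only (the real implementation would live in Skosmos' PHP code). It assumes the skosmos:includeConfig values have already been extracted from the parsed config.ttl and are passed in as plain strings; all function names are hypothetical. Local targets are resolved relative to config.ttl, URLs are fetched over HTTP, and included documents are not scanned for further includes.

```python
from pathlib import Path
from typing import Callable, Iterable, List
from urllib.parse import urlsplit
from urllib.request import urlopen


def is_url(target: str) -> bool:
    """Treat only http(s) targets as URLs; everything else is a file path."""
    return urlsplit(target).scheme in ("http", "https")


def fetch_url(url: str) -> str:
    """Download a Turtle document, e.g. from a Graph Store Protocol URL."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def load_with_includes(
    main: Path,
    includes: Iterable[str],
    fetch: Callable[[str], str] = fetch_url,
) -> List[str]:
    """Return Turtle documents: the main file first, then each include.

    Inclusion is one level deep: included documents are returned as-is and
    are never inspected for includes of their own.
    """
    documents = [main.read_text(encoding="utf-8")]
    for target in includes:
        if is_url(target):
            documents.append(fetch(target))
        else:
            # Relative paths are resolved against the directory of config.ttl.
            documents.append((main.parent / target).read_text(encoding="utf-8"))
    return documents
```

The returned documents would then be parsed and merged into a single configuration graph; the `fetch` parameter exists so the HTTP step can be swapped out, for example in tests.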

nichtich avatar Feb 21 '23 07:02 nichtich

Implementation here. No test coverage so far but "works on my machine". I'll do a polished PR if the functionality is agreed upon.

nichtich avatar Feb 22 '23 19:02 nichtich