Skosmos icon indicating copy to clipboard operation
Skosmos copied to clipboard

Load config from URL

Open pulquero opened this issue 2 years ago • 2 comments

Reasons for creating this PR

Enable the creation of a reusable (deployment-agnostic) docker container.

Link to relevant issue(s), if any

  • Closes #

Description of the changes in this PR

  1. Make it easier to manage multiple configs without having to rename files back-and-forth with the introduction of a SKOSMOS_CONFIG environment variable.
  2. Externalise config from the docker container by allowing URLs (incl sparql query URLs) as well as filenames.

Known problems or uncertainties in this PR

Checklist

  • [ ] phpUnit tests pass locally with my changes
  • [ ] I have added tests that prove my fix is effective or that my feature works (if not, explain why below) As there are a large number of outstanding PRs, I'm loath to spend time beyond my immediate needs on stuff that may just end up going nowhere.
  • [x] The PR doesn't introduce unintended code changes (e.g. empty lines or useless reindentation)

pulquero avatar Dec 01 '21 15:12 pulquero

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 1 Code Smell

No Coverage information No Coverage information
0.0% 0.0% Duplication

sonarcloud[bot] avatar Dec 01 '21 16:12 sonarcloud[bot]

Thanks for the PR and sorry for the slow response. This is a good feature (both introducing the environment variable and allowing URLs to be used instead of filenames) and IIRC something like this has been requested by other users as well in the past although I forget the details.

I'm worried about the performance. Parsing the config file can be a very expensive operation, tens or hundreds of milliseconds, and if there is no caching on the Skosmos side, then both URL retrieval and parsing has to be done each time a page is being generated - sometimes several times when the page performs AJAX style queries or accesses the REST API, for example on the vocabulary home page. The cache seems to be bypassed now in case a URL is given. Although this is not a regression in a strict sense (previously it wasn't possible to use a config URL at all, now it's possible but potentially very slow), I don't think this is an ideal solution.

I can see two possible improvements:

  1. Use the timestamp (Last-Modified probably?) from the HTTP response to implement smarter caching. Or even better, implement full HTTP caching semantics (ETag, If-Modified-Since, Cache-Control etc.). Only re-parse when the configuration on the HTTP server has actually changed. But this could still involve lots of HTTP requests (depending in part on how the HTTP server is configured), even if it saves on parsing time. And it would be very difficult to implement from scratch, but I'm sure there are existing HTTP client libraries (Guzzle, PHP-HTTP...) that could do this if we decide to add a dependency.
  2. Add a simple timeout for the configurations loaded using HTTP. For example, the cached configuration would be valid up to 5 minutes, then it would be fetched and parsed again.

I'd go for 2, since it's a lot simpler. It would still be possible to implement option 1 later, if necessary.

osma avatar Mar 02 '22 13:03 osma