API Documentation caching
I'm looking into ways to improve Hydra APIs by using cache headers. A first-order recommendation is to use versioned assets with a long-lived, immutable cache. I think this fits the most common case, where the API Documentation remains static at least until the server app is restarted.
Implementing this behaviour would require three changes:
First, add a version query string to the API Documentation link header, possibly something like a UNIX timestamp:
-link: </api>; rel="http://www.w3.org/ns/hydra/core#apiDocumentation"
+link: </api?v=123456789>; rel="http://www.w3.org/ns/hydra/core#apiDocumentation"
Second, add a Cache-Control header to the API Documentation response itself:
Cache-Control: max-age=31536000, immutable
Lastly, actually serve the triples with every occurrence of the URI /api rewritten to /api?v=123456789, so that the client can correctly find the resource in the representation.
This should allow proxies to cache the API documentation.
Here is the diff that solved my problem:
diff --git a/node_modules/hydra-box/lib/middleware/apiHeader.js b/node_modules/hydra-box/lib/middleware/apiHeader.js
index 8751fbc..e9b799c 100644
--- a/node_modules/hydra-box/lib/middleware/apiHeader.js
+++ b/node_modules/hydra-box/lib/middleware/apiHeader.js
@@ -1,16 +1,29 @@
const { Router } = require('express')
+const $rdf = require('rdf-ext')
+
+const timestamp = Date.now()
function factory (api) {
const router = new Router()
+ const timeDependentApiId = $rdf.namedNode(`${api.term.value}?v=${timestamp}`)
+ const dataset = api.dataset.map(({ subject, predicate, object, graph }) => {
+ return $rdf.quad(
+ subject.equals(api.term) ? timeDependentApiId : subject,
+ predicate,
+ object.equals(api.term) ? timeDependentApiId : object,
+ graph)
+ })
+
router.use((req, res, next) => {
- res.setLink(api.term.value, 'http://www.w3.org/ns/hydra/core#apiDocumentation')
+ res.setLink(timeDependentApiId.value, 'http://www.w3.org/ns/hydra/core#apiDocumentation')
next()
})
router.get(api.path, (req, res, next) => {
- res.dataset(api.dataset).catch(next)
+ res.setHeader('cache-control', 'max-age=31536000, immutable')
+ res.dataset(dataset).catch(next)
})
return router
This issue body was partially generated by patch-package.
Having experimented with this approach a little, I had limited success. The problem with a query string is that it makes the URL a different identifier, which caused me trouble on the client when trying to find the documentation resource.
A different approach I tried was a shorter cache age combined with an ETag. This appears to work nicely:
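To illustrate the mismatch: IRIs are compared as whole strings, so a client looking for the plain /api term will not match the versioned one. A minimal sketch, where the base URL is made up and only the two relative references come from the example above:

```javascript
// Hypothetical base URL, for illustration only.
const base = 'http://example.com/'

const plain = new URL('/api', base).href
const versioned = new URL('/api?v=123456789', base).href

// The query string is part of the identifier, so the two terms differ.
const sameIdentifier = plain === versioned

console.log(plain)          // http://example.com/api
console.log(versioned)      // http://example.com/api?v=123456789
console.log(sameIdentifier) // false
```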
diff --git a/node_modules/hydra-box/lib/middleware/apiHeader.js b/node_modules/hydra-box/lib/middleware/apiHeader.js
index 8751fbc..33546b7 100644
--- a/node_modules/hydra-box/lib/middleware/apiHeader.js
+++ b/node_modules/hydra-box/lib/middleware/apiHeader.js
@@ -1,15 +1,32 @@
const { Router } = require('express')
+const $rdf = require('rdf-ext')
+const etag = require('etag')
+const toCanonical = require('rdf-dataset-ext/toCanonical.js')
+const preconditions = require('express-preconditions')
function factory (api) {
const router = new Router()
+ const apiEtag = etag(toCanonical(api.dataset))
+
router.use((req, res, next) => {
res.setLink(api.term.value, 'http://www.w3.org/ns/hydra/core#apiDocumentation')
next()
})
- router.get(api.path, (req, res, next) => {
+ router.get(api.path,
+ preconditions({
+ async stateAsync() {
+ return {
+ etag: apiEtag
+ }
+ }
+ }),
+ (req, res, next) => {
+
+ res.setHeader('cache-control', 'max-age=30, stale-while-revalidate=30')
+ res.setHeader('etag', apiEtag)
res.dataset(api.dataset).catch(next)
})
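For context, the revalidation that the ETag enables can be sketched without any dependencies. This is a simplified illustration and not how express-preconditions is implemented: `computeEtag` and `isFresh` are hypothetical helpers, the hash is a plain SHA-1 over a string standing in for the canonical dataset, and only a basic `If-None-Match` comparison is handled.

```javascript
const { createHash } = require('crypto')

// Hypothetical helper: derive a quoted ETag from a serialized representation.
function computeEtag (body) {
  const hash = createHash('sha1').update(body).digest('base64')
  return `"${hash}"`
}

// Hypothetical helper: true when the client's cached copy is still current,
// in which case the server can answer 304 Not Modified without a body.
function isFresh (ifNoneMatch, currentEtag) {
  if (!ifNoneMatch) return false
  if (ifNoneMatch.trim() === '*') return true
  const current = currentEtag.replace(/^W\//, '')
  return ifNoneMatch
    .split(',')
    .map(tag => tag.trim().replace(/^W\//, '')) // If-None-Match uses weak comparison
    .includes(current)
}

// Stand-in for the canonical N-Quads of the API documentation (made-up IRI).
const canonical = '<http://example.com/api> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/hydra/core#ApiDocumentation> .\n'
const apiEtag = computeEtag(canonical)

console.log(isFresh(apiEtag, apiEtag))   // true: respond 304
console.log(isFresh('"stale"', apiEtag)) // false: respond 200 with the dataset
console.log(isFresh(undefined, apiEtag)) // false: first request, nothing cached
```

With `max-age=30`, a cache only needs to perform this check once the 30 seconds have elapsed, and a 304 response avoids sending the documentation again.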
There is no single way to set up caching, and some APIs may choose not to cache at all. I was thinking that maybe hydra-box could introduce extension points to plug in middleware before the get(api.path) handler? Something like:
-function factory (api) {
+function factory (api, ...beforeApi) {
- router.get(api.path, (req, res, next) => {
+ router.get(api.path, ...beforeApi, (req, res, next) => {
res.dataset(api.dataset).catch(next)
})
}
For the configuration above, I would provide the preconditions middleware and a second middleware that sets the cache-control and etag headers to my liking.
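That second middleware could look roughly like the sketch below. `cacheHeaders` is a hypothetical name, and the `factory(api, ...beforeApi)` call in the trailing comment assumes the signature proposed above, which hydra-box does not provide today.

```javascript
// Hypothetical middleware factory: sets the caching headers, then passes control on.
function cacheHeaders (apiEtag) {
  return (req, res, next) => {
    res.setHeader('cache-control', 'max-age=30, stale-while-revalidate=30')
    res.setHeader('etag', apiEtag)
    next()
  }
}

// Proposed usage, assuming the extended factory signature suggested above:
// factory(
//   api,
//   preconditions({ stateAsync: async () => ({ etag: apiEtag }) }),
//   cacheHeaders(apiEtag)
// )
```

Keeping the headers in caller-supplied middleware leaves the default handler unchanged for APIs that do not want any caching.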