ember-service-worker-asset-cache
ember-service-worker-asset-cache copied to clipboard
Ignore query string for cache match
The query string is also taken into account when looking for a cache match. That means
My app wants to fetch the fontawesome font, from a URL like fonts/fontawesome-webfont.woff2?v=4.7.0 but it's not going to match the service worker's cache because the add-on expects an exact match on the request URL:
https://github.com/DockYard/ember-service-worker-asset-cache/blob/master/service-worker/index.js#L71-L85
At the same time, caches are stored by their file name, excluding the query string so there will be no match when looking up the cache (as an exact match is required).

This PR makes the match less strict by ignoring the query string when matching. I am aware that this might be too loose so I'm happy to make a change to this PR should it be needed.
I was having the exact same problem. I'm now consuming your ember-service-worker-asset-cache fork and it totally fixed it. Would love to get this into master. Ty for the PR @balinterdi
@ReasonableDeveloper Glad it got your problem fixed, too.
@eshtadc Any chance of merging this PR? Thank you.
It appears that the files are stored without query string because that's how they are declared in the config. So if you wanted to cache a specific version from the query string you'd be out of luck by removing it, right? Going to run some tests this morning. Maybe it would be better to make it configurable or to find a more sophisticated way to match like ignoring query strings in the match if there isn't a query string in the configured url?
A suggestion from discord:
is it normal that the files include the version specified in the query string? I would assume that the name is the same and what only changes is the query string (I could be wrong though). I think it would be great if this ignoreSearch option can be specified in ember-cli-build file. Perhaps having two arrays, one for include files considering query strings and another one for ignoring them would be more flexible, but I haven't had time to check the code to see how to implement this
@eshtadc Thank you for taking the time to think about this.
I like the suggestion from Discord a lot because I also think being able to decide whether to include the query string (QS) in the match or not should happen on a per-pattern basis.
A further advantage of the two-arrays approach is that it can be made backwards compatible. The current way of just having one include will keep taking the QS into account while the second one, includedWithoutQueryString (unless we find a better name :) ) will match without the QS going forward.
I'll try to find some time tomorrow to play with this and see what it gives.
@balinterdi the thing about the two arrays that could be problematic is to accidentally include the same assets two times in the cache.
What if we allowed optional additional configuration in the existing include and exclude? Right now it is an array of strings (possibly with globs)
include: [
'assets/admin-engine.js',
'fonts/font-awesome.*'
],
// which asset files to exclude, glob paths are allowed!
exclude: [
'**/*.gif'
],
We could still allow a string value, but if an object is passed it could be something like:
include: [
'assets/admin-engine.js',
{
path: 'fonts/font-awesome.*',
cacheQueryString: ['version', 'anotherParam'] // an array of parameter names to explicitly include in cache regardless of value
},
....
],
While we could also add a key for ignoreQueryString then we run into validating what happens if the user manages to both cache and ignore a value? I'm inclined to think it's more common that there is something that is explicitly part of the cache url rather than you want to cache all values except a particular one.
We'd probably also want an overall setting for whether to match on query parameters when there aren't more explicit values given. In the example above, that setting would apply to assets/admin-engine.js. I would advocate to default this to the current scenario to protect backwards compatibility - but maybe could be convinced.
Of course, with all this said, it will be a bit of gymnastics to deal with the ensuring the cache keys are set to the url with query parameters and their cached values and possibly alphabetizing the cache matching query parameters to ensure an exact key match... but that could be extracted into a helper pretty easily I would think.
@balinterdi @ReasonableDeveloper @HenryVonfire Thoughts?
After thinking about it, I really like your proposed API. :+1:
It leaves the ability to add additional keys/options in the future and I also like the idea of being as explicit as possible.
I think talking about the default behavior is important here. Those are the only three good defaults that I can think off:
- Ignore QS in the
cacheand strictly match thecachekey with theurl(I think this is the current behavior in case theurlof theincludesarray doesn't include any QS) - Ignore QS in the cache but loosely match
urlswith thecachekey (@balinterdi PR) - Each distinct
url(including its QS) has its owncachekey. This would require strictly matching theurlwith the path in the URL. This default would have a ton of caveats and probably shouldn't be used.
If I understand you correctly, the default would still be [1] but now we have the option to explicitly include the QS by specifying them in cacheQueryString. I'm assuming this would optionally enable loosely matching the urls [2] (or at least take the QS into account when finding a match) instead of creating a new key in the cache for each distinct url [3]. This would remove the burden of adding an additional ignoreQueryString key right?
If we still want to provide an option to always loosely match the url to the cache key, regardless of the QS, we could still provide a wildcard/regex/glob string value to your proposed cacheQueryString option.
Your example could look something like the following:
include: [
'assets/admin-engine.js',
{
path: 'fonts/font-awesome.*',
cacheQueryString: '*' // could also be a glob or regex according to DockYard API preferences
}
],
I think we're mostly on the same page here. Yes, I was proposing that we maintain the default behavior to match the current behavior (your [1]), but allow setting that default behavior to [2] with an additional setting. This would prevent us from changing the behavior for those already using it but make the change available. I suppose the other option is to make it a breaking change on the default and update versions accordingly.
As far as the glob idea to replace the default, I want people to be able to continue to use the addon with just an array of strings rather than having to explicitly set up the query string matching behavior for each item. If I removed the default ignoreQueryString (or whatever name) then the implementer would need to explicitly set up handling for each item they wanted loosely matched. (If I am understanding correctly, of course). I do like the idea of accepting globs, though, that you propose, although I wonder how useful it would be given that we're just talking about query string parameters. I'm not able to immediately think of some use cases where this would be particularly helpful. Do you have some in mind?
I was also having a hard time thinking how a regex or glob for query strings could be useful. Maybe in case you don't know the exact query string beforehand but I think query strings don't change that often so I probably wouldn't need it.
Then I'm all for keeping it simple. If a use case arises it makes sense to keep it in mind. I'm here at EmberConf and may or may not get to work on this over the next few days - but code is always welcome. I will be back in commission in a few days.
Sorry for delaying my feedback on this. I like where this is going but want to make sure I understand correctly.
Given the following configuration:
include: [
'assets/admin-engine.js',
{
path: 'fonts/font-awesome.*',
cacheQueryString: ['v'] // an array of parameter names to explicitly include in cache regardless of value
},
....
],
What happens if I a request is made to fonts/fontawesome-webfont.woff2?v=4.7.0 and then later another request is made to fonts/fontawesome-webfont.woff2?v=5.1.0? My understanding is that the response to the first request is cached and then served to the second request.
If that's correct, we might want to rename cacheQueryString to ignore(d)QueryParams because that's what it is – the cache manager just ignores the value (and even the presence!) of that query param. Even the response for fonts/fontawesome-webfont.woff2 will be cached and served to fonts/fontawesome-webfont.woff2?v=5.1.0.
If the user wants to completely ignore all query params and serve cached content to all requests with the same path, they could set ignore(d)QueryParams to *, which is what my PR does.
👍 ?
@balinterdi I was thinking of it more as the opposite.
With your example a call to fonts/fontawesome-webfont.woff2?v=4.7.0 would not match a call to fonts/fontawesome-webfont.woff2?v=5.1.0 because the v is configured to be included. However, a call to fonts/fontawesome-webfont.woff2?kermit=frog and fonts/fontawesome-webfont.woff?kermit=mc would return the same thing because you have explicitly defined that you only want v to be included.
I also proposed an application-wide setting that would handle those that aren't explicitly defined like assets/admin-engine.js. If you had ignoreQueryString set to true then calls to assets/admin-engine.js?gonzo=daredevil would match assets/admin-engine.js?gonzo=chicken-lover. If you had it set to false that the matches would explicitly match against all query parameters so these two would be treated as different unique requests.
Gotcha. That makes a lot of sense, I think this is really good.
Cool. So far I have the easy part (mostly because @balinterdi did most of the work). Working on the harder part next although I only have a few more hours today to spend on it. https://github.com/DockYard/ember-service-worker-asset-cache/pull/44