Allow cache busting string in `theme_image()`
This is the respective issue for https://www.drupal.org/node/2313539, which was originally filed against D7 core; currently against D8 core.
The problem came up while working in #4104, where thumbnail images generated for the flexible layout templates would not refresh, unless you did a hard-refresh on your browser (CRTL+F5 / CMD+SHIFT+R).
Problem/Motivation
When working on a module that uses File Entity to handle media assets we've discovered that currently, there is no way to bypass browser file cache in
theme_image()andtheme_image_style(). File Entity provides users with a way to replace their file with a new one, without changing the name of the file. Because filename remains the same, it's still cached in browser and will not show up until clearing page cache with CTRL+R.This might be fine and we're used to it on the user facing site, but it might confuse authors who just replaced their image with a new one, and it's not showing the latest image in admin interface. It would be nice to have a way to bust browser cache that works with Drupal's theming functions, so we don't reinvent the wheel.
PS: also mentioned in https://www.drupal.org/project/file_entity/issues/1701924
Update: It turns out that the D7/8 solution only adds timestamps to managed files, whereas the same problem where the browser cache may serve stale versions of images can occur with unmanaged files too. This is common with files used in the admin UI, and it's the situation in #4104; it has also been reported as a problem by contrib modules (https://www.drupal.org/project/drupal/issues/2800731 for example).
- PR by @klonos (adds a timestamp only to managed images): https://github.com/backdrop/backdrop/pull/2916
- Alternative PR by @klonos (adds timestamp to both managed and unmanaged images): https://github.com/backdrop/backdrop/pull/2920
I have filed a PR to backport the latest D8 patch for this. I have fixed as many tests as possible, but I have now reached a point where my test-writing skills do not help. Anybody want to give this a crack, please feel free.
Just an idea: how does your PR (port) handle external images?
One of the tests failing is "User picture source is correct" (UserPictureTestCase->testExternalPicture), which checks for an external user image.
Others seem to fail on images provided by modules (feed.png).
And as there are some SimpleTest tests failing - could it be, we need to adapt the xpath settings of SimpleTest? (Not sure about that.)
I've left a comment in your PR.
One of the tests failing is "User picture source is correct" (UserPictureTestCase->testExternalPicture), which checks for an external user image.
Yup, I tried to check that one out, and it brought back memories (of UX nightmares we have inherited from D7) 😓 ...I have filed #4110 to replace that with a proper image upload widget.
I've left a comment in your PR.
That was it @indigoxela!! ...tests are green 🎉
...I am raising the priority of this bug fix to high, since it's a blocker for #4104 working properly.
@klonos I've just tested your PR locally - it does what it's expected to do. The src attributes of image fields get the timestamp attached as query string. Other images (not using theme_image with $timestamp) are left untouched.
Thanks @indigoxela 👍 ...I'm about to test this on my local, with #4104, to see if it addresses the issue with the thumbnails not being updated upon update.
Other images (not using theme_image with $timestamp) are left untouched.
Hmm 🤔 ...then this doesn't help with #4104 as I expected it to. I would like it if all images would get this timestamp token.
Other thoughts:
- At my work, we often get support tickets raised because the aggressive caching (Akamai) shows outdated files of all types (usually government .pdf documents). I am thinking that perhaps we should be adding a timestamp token to all file links.
- This timestamp gets appended to the URL in the form of
?timestamp=1570291373. a) do we want to keep it astimestamp, or change to something shorter perhaps?, and b) should we be using a yyyymmddhhmmss representation instead of a UNIX timestamp? (would make better sense to humans when reading it, re file creation date).
OK, so the "problem" is that the current solution in D7/8 accounts only for managed files, but this stale image cache situation can also happen with unmanaged files (which is the case with #4104). I have updated the issue summary to point that out.
I have a working solution on my local, but I will file a separate PR for it, so that both solutions can be reviewed/considered.
...OK, here's the alternative PR: https://github.com/backdrop/backdrop/pull/2920
There is a performance thing to consider: can we take for granted that managed files always have a timestamp entry in the db? This would avoid the need for the check that is calling file_load_multiple() for each image. We would only need to check whether the image is unmanaged, and if so, generate a timestamp for it on-the-fly (using php's filemtime()).
This is a PR with the changes from the original PR + #4104 merged: https://github.com/backdrop/backdrop/pull/2919 ...if you test things out in the PR sandbox, you will notice that the same issue with the stale layout template thumbnails still exists 👎 (because the timestamp is added only to the managed images - and the thumbnails are not managed).
Here'a another PR, with the changes from the alternative PR + #4104 merged: https://github.com/backdrop/backdrop/pull/2921 ...if you test things out in that PR sandbox, you will notice that the same issue with the stale layout template thumbnails is solved 👍
@klonos I'm trying to keep up with your plans/ideas.
Your previous PR https://github.com/backdrop/backdrop/pull/2916 works fine for managed files. Tests are passing. Your latest experiments seem broken. I'm not really sure, what you're trying to accomplish.
For https://github.com/backdrop/backdrop-issues/issues/4104 you need cache busting for a specific unmanaged file. For the generated layout thumbnail it might make sense to always attach a query string to the src attribute (use request time for that...?). That should be easy in core/modules/layout/layout.theme.inc.
Or are you planning to add that functionality to all images globally? I don't see how this could work. And it maybe doesn't make much sense either.
Obviously - I don't get it yet. ;) Mind to explain a bit?
I think the ability to append a timestamp or cache-busting string is a good feature to have. Though I am concerned about impacts this may have (which may or may not be valid):
- We need to load managed files from the database in order to get the modified time. Seems like this is a new SQL query per file displayed on the page(?)
- If it's not a managed file, we're appending
filemtime(), which hits the disk to get the modified time. I've seen situations where the OS sometimes can't get an accurate modified time at all, such as when using weird file systems like Gluster or the S3 module. In which casefilemtime()may return a different value on every page request (thus making images uncacheable in the browser). - Automatically appending the timestamp everywhere seems aggressive, OTOH we'd pretty much have to do this in order for it to work when replacing an image with one of the same file name.
All things combined, I'd like to know if merely adding the ability would be sufficient here (it would at least unblock #4104).
@klonos I'm trying to keep up with your plans/ideas. ...Your latest experiments seem broken. ...Or are you planning to add that functionality to all images globally?
Sorry 😓 ...I plan to work on fixing the remaining test failures soon. ...so yes, all images globally I guess. #4104 was what has brought this issue here to light. Implementing the same fix as D7/D8 does not seem to fix #4104, so I made the last change, which was to apply timestamps to unmanaged files too. That fixes the problem caused by the browser cache in #4104 (and future-proofs against similar issues in the future).
All things combined, I'd like to know if merely adding the ability would be sufficient here (it would at least unblock #4104).
Yes, the initial proposal in the d.org issue was to allow for something like this:
$variables = array(
'path' => 'path/to/img.jpg',
'alt' => 'Test alt',
'title' => 'Test title',
'width' => '50%',
'height' => '50%',
'attributes' => array('class' => 'some-img', 'id' => 'my-img'),
// Bypass cache
'cache' => FALSE,
);
$img = theme('image', $variables);
Would the addition of 'cache' => FALSE, be more acceptable here? (would make this an opt-in feature, which could be applied per image/case). I can work towards something like that.
Sounds good. Maybe have timestamp be one of the following values: an integer or string (in which case it is appended) and FALSE or NULL to have no timestamp appended.
I don't think we should have an option to attempt to automatically append a timestamp based on filemtime(), as I don't think it's safe to assume that will work across all types of environments.
thumbnail images generated for the flexible layout templates would not refresh, unless you did a hard-refresh on your browser (CRTL+F5 / CMD+SHIFT+R).
We get around this everywhere else in core by changing the file name. Have we talked about doing that here? A different file every time would avoid needing any cache busting, or changes to theme_image().
We get around this everywhere else in core by changing the file name. ...
That was my original implementation of this, but it feels like a weird/nasty workaround to have this:
https://my.awesome.site/whatever/my-awesome-image-for-thingy-xyz-randomized-1234.png
...instead of this:
https://my.awesome.site/whatever/my-awesome-image-for-thingy-xyz.png?timestamp=1234
This also makes it hard(er) to work with these images (files in general), as there will never be a standard/predictable way to reference them ...not talking about the specific use case in #4104, because in that case the path to the image is being stored in the object/array - I'm talking in general here, and trying to solve this in a "proper" way, so that both core and contrib can benefit from this improvement.
So if timestamps were used in the http query, and I knew that all images for thing xyz were using the pattern [thingy_name].png, I would then know what to expect. If the way these were named was [thingy_name]-[random_thingy_here].png, then I would have to use pattern/regex matching to remove the random thing from the possible names.
This also makes it hard(er) to work with these images (files in general), as there will never be a standard/predictable way to reference them
But, would we ever need to reference these things separate from the layout template itself? I think a file name like 384jweuvfwou.png is probably fine. This isn't a user-uploaded file or something anyone will need to see or manage anywhere else.
It seems like this may be overthinking a problem we don't really have.
But, would we ever need to reference these things separate from the layout template itself?
Again, this issue here is not just about #4104 (which is specifically about the images generated for the flexi templates). This is about https://www.drupal.org/node/2313539, which is a problem we have as well, and which can obviously manifest itself in contrib too (see https://www.drupal.org/project/file_entity/issues/1701924).
Perhaps I am reading/interpreting all that wrong though.
In looking for issues that are "nearly there," I see this one needing only code review (according to its labels). But there are two PRs, https://github.com/backdrop/backdrop/pull/2916 (managed files only) and https://github.com/backdrop/backdrop/pull/2920 (both managed and unmanaged files).
Since we don't seem to have sufficient clarity to narrow things down to a single preferred PR, I'm taking the liberty of adding the label "Needs more feedback".
And a request for @klonos: would you add the magic "Fixes" word to the first comment of both PRs so that GitHub will show them as the candidate PRs for this issue on this issue's page?
Sorry @bugfolder and others - this one slipped under the cracks over the years.
And a request for @klonos: would you add the magic "Fixes" word to the first comment of both PRs so that GitHub will show them as the candidate PRs for this issue on this issue's page?
Done 👍🏼
This is where this issue currently stands to the best of my understanding:
- There are 2 regular/legitimate PRs, one as an alternative of the other:
- https://github.com/backdrop/backdrop/pull/2916 is an attempt to backport https://www.drupal.org/node/2313539. I've fixed conflicts and rebased, and it is currently passing all tests (excluding CSPell checks, which will be addressed with a separate PR in #6302 since they have nothing to do with changes introduced here, and are affecting other PRs too)
- https://github.com/backdrop/backdrop/pull/2920 extends the previous PR to add timestamp to unmanaged "shipped" files as well. I've resolved conflicts and rebased, but there are currently several legitimate test failures that I will need to work on.
- There are another 2 test PRs that should not be merged (which is indicated in the PR titles), and which I have intentionally not linked to this issue - these were for my benefit, in order to help with testing how the changes of either of the legitimate PRs will affect #4104.
- @quicksketch expressed concerns about 3 things in https://github.com/backdrop/backdrop-issues/issues/4107#issuecomment-538775742 which will need to be investigated, and then he also followed up with a suggestion in https://github.com/backdrop/backdrop-issues/issues/4107#issuecomment-538788689 that will need to be implemented.
- @jenlampton suggested to consider appending a randomized token to the filename instead of timestamp queries, however, as I replied, this is not to specifically address the case of #4104 (although chances are that it will solve it as a "side effect"; maybe not though) - this is about backporting https://www.drupal.org/node/2313539 + also consider the same cache busting method for unmanaged/"shipped" files (if there are no performance implications and all concerns @quicksketch expressed have been investigated)
- I have taken another look at the PRs and the patches in d.org, and there's work that needs to be done. There's also things like #6676 that would help to be merged before we work further on this issue here.
I hope that this makes it clear, and less work/research to be done when we revisit this in the future.