joomla-cms
Chunked downloads of update packages
Pull Request for Issue #36741.
Summary of Changes
You can now tell Joomla Update to download the upgrade packages one megabyte at a time. This allows the update to download on very slow servers (see the linked issue).
Tagging @zero-24 @laoneo @HLeithner
Why bother on Joomla 3?
First of all, the majority of Joomla sites are still on Joomla 3. It will take a long while for everyone to migrate to Joomla 4 and beyond. Let's not forget that there are still 3PD extensions which are still in the process of migrating to J4. Most sites can't migrate until it's clear which extensions can finish their migration and which are left behind to die.
Right now there are servers which fail to download the Joomla update package. If there is a security issue with Joomla 3 it will be very hard (if not impossible, depending on the site owner's skill level) to update these sites, exposing them to security risks. We don't even provide an alternative method, like we do in Joomla 4 (you can update J4 over CLI, where waiting even half an hour for the update package to download is not a practical problem!). Therefore it makes sense to start by addressing this issue on Joomla 3 even though it's currently on security maintenance only.
If this approach works for y'all (and the production leadership ultimately decides that they cannot use CloudFlare as the default mirror which would solve this problem on most servers!) I can port this to Joomla 4.
Testing Instructions
Common instructions
- Install a new site
- Apply the patch
- Change the `manifest_cache` in the `#__extensions` table for `extension_id` 700 to have a fake version, e.g. 3.10.10.
- Change `libraries/src/Version.php` to set PATCH_VERSION = 10
- Log into the site's backend
- Go to Components, Joomla Update
Test 1
Makes sure that updates without any further configuration still work the same way as they did before.
- Follow the Common instructions
- Click on Check for Updates
- An update is found
- Click on the button to install the update
- You wait for a while, the progress bar moves to 100% and the update installation starts.
Test 2
Makes sure that chunked downloads work.
- Follow the Common instructions
- Click on Options
- Click on Fine-tuning
- Set Chunked Downloads to Yes
- Click on Save & Close
- Click on Check for Updates
- An update is found
- Click on the button to install the update
- You will see the progress bar move every 3 seconds or more
- Once the download is complete the progress bar shows 100%, it becomes green and the update installation starts.
Test 3
Makes sure the fine-tuning setting for the minimum time works.
- Follow the Common instructions
- Click on Options
- Click on Fine-tuning
- Set Chunked Downloads to Yes
- Set “Minimum time between download steps (seconds)” to 10
- Click on Save & Close
- Click on Check for Updates
- An update is found
- Click on the button to install the update
- You will see the progress bar move every 10 seconds or more. The download is substantially slower than test 2.
- Once the download is complete the progress bar shows 100%, it becomes green and the update installation starts.
You can redo this test with the minimum time set to 0 and observe that the download is much faster — unless your server throws an error because of too many requests…
Documentation Changes Required
Add the following to the help page for Joomla Update options.
The Fine-tuning tab's options control how Joomla Update will download the update packages before installing them on your site.
By default, Joomla Update tries to download the entire update package in a single page load. If your server has a slow network connection to the Joomla Project's update download servers it may time out before the download is complete, making it impossible to update your site.
If this happens to you — and only if it happens! — you should change the “Chunked Downloads” option to Yes. Joomla Update will now download 1 megabyte of the update package at a time, minimising the risk of your server timing out.
The “Minimum time between download steps (seconds)” controls how fast these 1 megabyte downloads will come. If the last attempt to download 1 megabyte of the update takes less time than this setting is configured to, Joomla Update will wait for the remainder of the time. Setting this to 0 will make downloads very fast but there is a chance that your hosting company's server protection may block you and break the updates. If you receive an error from your server while the download is in progress increase this setting. If you do not see any errors try decreasing this setting to make downloads faster. The default value of 3 seconds strikes a good balance between download speed and working around server protections which would block the update package download on most commercial hosting companies.
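The waiting rule described above can be sketched as follows. This is a minimal illustration in Python, not the code from this PR; the names `remaining_wait` and `run_step` are hypothetical, and the 3 second default is taken from the text.

```python
import time


def remaining_wait(elapsed_seconds, min_step_seconds=3.0):
    """Seconds still to wait so one download step lasts at least min_step_seconds."""
    # A step that already took longer than the minimum needs no extra wait.
    return max(0.0, min_step_seconds - elapsed_seconds)


def run_step(download_chunk, min_step_seconds=3.0,
             sleep=time.sleep, clock=time.monotonic):
    """Run one download step, then pad it up to the configured minimum time."""
    started = clock()
    result = download_chunk()  # hypothetical callable that fetches one chunk
    wait = remaining_wait(clock() - started, min_step_seconds)
    if wait > 0:
        sleep(wait)
    return result
```

With the minimum set to 0 the wait is always zero, which matches the "very fast but possibly blocked" behaviour described for that setting.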
Translation Changes Required
This PR introduces 10 new language strings.
Important notes
This is for @HLeithner mainly and to explain why this option is disabled by default.
Every Joomla update provides a set of mirrors (alternate download URLs). The first mirror is the Joomla downloads site which runs a very old version of Akeeba Release System and is linked to an Amazon S3 bucket.
The items in this bucket are stored with Private ACL. ARS on the downloads site returns a pre-signed URL which allows only one specific HTTP verb: GET. There is no way to allow alternate verbs, such as HEAD.
To make chunked downloads possible we need to do a HEAD request on the mirror's download URL and follow all redirections. When we find the final destination we perform a final HEAD request and expect to see the HTTP Content-Length header, which tells us how big the file to download is. Since the first mirror redirects to S3 using a pre-signed URL, and the pre-signed URL does not allow HEAD requests, we cannot use the default mirror for chunked downloads. Therefore we will fall back to the next mirror, which is GitHub and, if I recall correctly, GitHub wasn't very happy with Joomla using it for all its downloads.
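To illustrate why the Content-Length is needed: once the total size is known, the download can be split into 1 megabyte byte ranges. The sketch below is Python, not the PR's PHP code. It assumes each chunk is requested with a standard HTTP `Range` header and that a chunk is exactly 1024 × 1024 bytes; `byte_ranges` and `range_header` are hypothetical names.

```python
CHUNK_SIZE = 1024 * 1024  # assumed chunk size: 1 MiB


def byte_ranges(total_size, chunk_size=CHUNK_SIZE):
    """Split a file of total_size bytes into inclusive (start, end) byte ranges."""
    ranges = []
    start = 0
    while start < total_size:
        # HTTP byte ranges are inclusive, so the last byte is start + size - 1.
        end = min(start + chunk_size, total_size) - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def range_header(start, end):
    """Format an inclusive byte range as an HTTP Range header value."""
    return f"bytes={start}-{end}"
```

Without a reported Content-Length there is no `total_size` to plan the ranges from, which is why a mirror that rejects HEAD requests cannot be used this way.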
In an attempt to alleviate the pain, I have implemented a simple workaround. If the user has NOT opted into chunked downloads I do NOT perform any HEAD requests. Instead, I simply get the list of mirror URLs and try them one after another, until one successfully downloads the entire file. Therefore the majority of users who are still on the single part download will be able to use the default mirror (hosted on S3) and all is good in the world.
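The non-chunked fallback amounts to a simple loop over the mirror list. The following Python sketch only illustrates the idea and is not the PR's implementation; `fetch` is a hypothetical callable that downloads a whole file and raises `OSError` on failure.

```python
from typing import Callable, Iterable, Optional, Tuple


def download_from_first_working_mirror(
    mirrors: Iterable[str],
    fetch: Callable[[str], bytes],
) -> Optional[Tuple[str, bytes]]:
    """Try each mirror in order; return (url, data) for the first that succeeds."""
    for url in mirrors:
        try:
            return url, fetch(url)
        except OSError:
            # This mirror failed (timeout, HTTP error, ...); try the next one.
            continue
    return None  # every mirror failed
```

Because no HEAD request is made in this path, the default mirror's GET-only pre-signed URL works as before.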
If you want to make chunked downloads the default or only option you MUST do one of the following:
- Set the default mirror to the CloudFlare one. CloudFlare does not have bandwidth caps and does report the Content-Length correctly. Moreover, it has edge nodes in a huge number of locations, meaning that the download will be faster. It also does not hit the S3 bucket which means that it's also cheaper. Finally, if you use the Always Online feature in CloudFlare you can make sure that the update downloads work even when the Joomla server is down. Like, seriously, if you have CloudFlare why are you even using anything else?!
- If you do not want to use CloudFlare (why?!) you MUST set the ACLs of all download items on the S3 bucket to Public, migrate all `s3://` pseudo-URLs in the database to `http://` or `https://` URLs pointing directly to the file on the S3 bucket, and change the item type to `link`. While certainly possible, I do not see why bother when you have a cheaper, faster, more reliable method.
@nikosdion Drone reports code style errors: https://ci.joomla.org/joomla/joomla-cms/57975/1/3
Hopefully the new commit will do the trick.
Joomla 3 is using such old versions of the code quality tools that I no longer have a PHP version old enough to run them on when developing. So, I have to wait for Drone to finish the tests which takes quite a while.
Test 2 - after 99% completion, I got an error in the popup window. I logged out of the site, cleared the cache, logged in again and repeated the test. I got the same error, only now the execution breaks immediately rather than at 99%.
The settings were set to 3 seconds.
5.5.5-10.5.15-MariaDB PHP 7.4.29 Joomla "changed" to 3.10.10
@Kostelano Oops, I had made a change to develop this feature without the update actually running. I accidentally committed that in the repo I made the PR from. I fixed it now. Can you please retest?
The fix works. Test 2 was successful.
@Kostelano Thank you for retesting!
~~Now proceeded to 1 - I DISABLED the parameter, but when updating, firstly, exactly the same window is called as in test 2 (as far as I understand, the window should also be disabled ...?), secondly, exactly the same error, like the posts above.~~
Sorry, updated an old patch. Ehhhh
No worries :) To clarify, in all of the tests you will see the new download UI. However, with the parameter disabled it goes from 0% straight to 100%. With the parameter enabled it progresses through the bar. The current 3.10.11 update is around 12Mb and we download 1Mb at a time so you see the bar progress 100/12 = 8.33% at a time.
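The arithmetic can be checked with a short sketch. This is illustrative Python with a hypothetical `progress_steps` helper, assuming the package is exactly 12 chunks of 1 MiB each.

```python
import math


def progress_steps(total_bytes, chunk_bytes=1024 * 1024):
    """Percentage shown after each chunk, rounded to two decimals."""
    chunks = math.ceil(total_bytes / chunk_bytes)
    return [
        round(100 * min(i * chunk_bytes, total_bytes) / total_bytes, 2)
        for i in range(1, chunks + 1)
    ]


steps = progress_steps(12 * 1024 * 1024)
print(len(steps), steps[0], steps[-1])
```

For a 12 chunk package this gives 12 steps, the first at 8.33% and the last at 100%.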
If you think this is unclear let me know, I can update the PR description.
I checked all 3 scenarios for the test. Everything works well. I am not sending a successful test yet, there will probably be some discussion and changes.
The only thing is that the download page of the package seemed a bit inconsistent in terms of design to me. I understand that this is a rather individual question. I would prefer the margins to fit the width of the screen (similar to Joomla's refresh page). Or, as an option, implement something similar as in Joomla 4 (in one line with execution, bytes, etc.).
@Kostelano The design of the page is better than the update page in 3 which is simply based on the fact that nobody wrote any CSS the last 10 years... What you see in the download page is the best you can do with the ten-year-old Bootstrap 2 still in use in Joomla 3. We can, of course, change the design of the update page to match. You can't do a single row without using Flexbox which was not an option in BS2 and I didn't want to add too much custom CSS.
Regarding the margins, I think that the 1200px max width is better than the insanity in Joomla 4. In J4 if you have a 48" ultrawide monitor and stupidly maximise the window it will happily take over an entire meter (1m) of screen width which is completely ludicrous. You can't read that information. It's silly. If we want Joomla to be silly let's add dancing unicorns and falling snowflakes on the update page like it's 1998 all over again, you know?
I have tested this item :white_check_mark: successfully on 6fcb71a1a105f12cdd6c244c47c88a0c085319d5
Tested it successfully on a web hosting where the Joomla update timed out before. They partially fixed the slow download, so a call like `wget https://downloads.joomla.org/cms/joomla4/4-2-2/Joomla_4-2-2-Stable-Update_Package.zip?format=zip` now takes only about 2 minutes. When running the update through the Joomla updater with this patch it finishes in around 20 seconds, though on J3, as I used the branch from this PR. So I very much welcome this change, even though I see here again no reason for an option. But in this case it looks like it is needed, so no objection on this.
Hopefully our infrastructure can be changed so that chunked downloads become the default behavior, as it looks much more professional and stable.
Good job!
This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/38774.
I have tested this item :white_check_mark: successfully on 6fcb71a1a105f12cdd6c244c47c88a0c085319d5
As I wrote above, all 3 scenarios for testing work well.
RTC
There is a line in the logs with information about fragment loading. The first fragment is not 1, but 0, then immediately 2. Something is wrong.
> There is a line in the logs with information about fragment loading. The first fragment is not 1, but 0, then immediately 2. Something is wrong.
The range looks ok, there is no gap, so it seems only the fragment number 2 is counted wrong.
Back to pending, see previous comments.
@Kostelano
> There is a line in the logs with information about fragment loading. The first fragment is not 1, but 0, then immediately 2. Something is wrong.
No, this is right. The first step is chunk -1 which tells the code to initialise the download. The next steps are 2, 3 etc for the byte range to be calculated correctly. As you can see the byte range is correct.
I cannot change the way chunks work because we'll end up breaking the download. In the grand scheme of things, having correct downloads is infinitely more important than having a number in the logs nobody will look at. So what I will do is change the message for the first chunk.
@Kostelano Could you test again just if the log is fine now? The rest hasn't changed, so if the log is ok I will set RTC again. But I don't have the time now to test it myself. Thanks in advance.
Alas, it's broken.
Sorry, it's a bit hectic around here with my daughter back from school. I meant to use max, not min.
The last commit fixed it, everything is good now. Thank you.
RTC
I suggest a different approach, to keep things easy: catch the failed download and show a message to the user suggesting they download and install the package manually. The message should contain a link to the release page.
How to catch a failed download: before the download starts, the Updater sets a session flag "updater doing download" and removes it after the download succeeds. After the server times out and the whole thing crashes, the user will try to refresh the page. The Updater checks for the flag and, if it is still present, shows a (warning) message to the user: "Your ISP is Potato, please try to download and install the package manually, and here is the link to download".
@Fedik Your approach is nonsensical and user-hostile! First of all, the user would wait forever until they are met with a server error page, thinking Joomla is broken. Then they need to go back to Joomla where they are met with a message that they need to download the file manually. However, their server is so crap that they can't even upload the (VERY) big Joomla update package. Therefore they are left with no update route. User's conclusion: Joomla sucks, we're moving to WordPress.
With my approach, even crappy servers can successfully download the update archive, however slowly, and run the update just fine. User's conclusion: My host sucks, but Joomla is a feckin' rockstar!
Also for your information: my approach didn't come from outer space. This is what I am doing in my own software (Akeeba Backup Professional) when you are downloading very large backup archives (think hundreds of MB to several GB big) from a remote storage back to your server. I don't tell people “aw, your file is too big, sucks to be you”. I download it a few MB at a time. It takes longer but it completes successfully.
The whole point is to provide the user with a VIABLE alternative, not tell them that it sucks to be them.
@zero-24 one for you. But I think this is worth merging
I agree with George, worth merging. We make sure that when we have a security update that people can download and install it.
As agreed in the last meeting and in today's meeting, there are no plans to merge this into J3, only into J4. No update has been posted here because we were awaiting an analysis of the concerns raised by @HLeithner, which I still have not received details on.
@zero-24 I have already addressed the concerns raised by @HLeithner in the “Important notes” part of my PR description when I submitted the PR over a month ago.
I also disagree that this shouldn't be merged on 3.10. This is where the pain is, not 4 (I have not even ported this code to 4 yet and, frankly, never will seeing that the project does not care).
Since this PR is only for Joomla 3 I am closing it with a mental note not to bother trying to fix Joomla.
> @zero-24 I have already addressed the concerns raised by @HLeithner in the “Important notes” part of my PR description when I submitted the PR over a month ago.
I saw that, which is why I wanted to wait for Harald to get back to me first with his concerns. He raised them to the team two weeks ago, and I have not received any details, as mentioned above.
> I also disagree that this shouldn't be merged on 3.10. This is where the pain is, not 4 (I have not even ported this code to 4 yet and, frankly, never will seeing that the project does not care).
The pain is more on J4, as there we have much larger packages that could have issues with limited download capacities. "Do not care" is not fair, but you know that too.
Thanks for your effort on this. Hopefully it can someday be ported to J4 and J5 so the update process can be improved for the future, especially for regions/servers that do not have such a good uplink to all CDNs.
> The pain is more on J4, as there we have much larger packages that could have issues with limited download capacities.
@zero-24 Updating from J3 to the latest J4 version uses the bigger update zip files, and many sites are still on J3 for whatever reason, so I think it would be good to have this in 3.10.