Upload: Auto adjust maxChunkSize until upload succeeds after HTTP-413
Currently, when Nextcloud runs behind a server with an upload limit, uploading a large file (e.g. several hundred MB) will likely fail unless nextcloud.cfg is manually adjusted on every endpoint to limit maxChunkSize (an upload limit is usually not set to 1G, but more likely to 100M or lower).
This PR implements auto-adjustment of the max upload size when receiving HTTP-413 (Request Entity Too Large) or a connection close error after large uploads.
This is achieved with the following strategy (see the sketch below):
- The config is enhanced with the parameter `failsafeMaxChunkSize` (default `100M`).
- When HTTP-413 is received:
  - If `requestSize > failsafeMaxChunkSize`, the failsafe max size applies and the sync is retried immediately.
  - Otherwise `maxChunkSize = requestSize / 2` and the sync is retried until it succeeds or `minChunkSize` is reached (default `1M`).
- A connection close error is treated like HTTP-413 if `requestSize > failsafeMaxChunkSize`. HTTP usually requires that requests are fully consumed; servers may decide to close the HTTP connection when too much data is sent, to prevent this. As connection close is not a specific error, only large uploads are handled by this change.
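A minimal sketch of this adjustment logic in Qt-style C++ (the standalone helper `nextMaxChunkSize` and the constants are hypothetical; only the defaults and the halving rule come from the list above):

```cpp
#include <QtGlobal>

constexpr qint64 failsafeMaxChunkSize = 100ll * 1000 * 1000; // default 100M
constexpr qint64 minChunkSize = 1ll * 1000 * 1000;           // default 1M

// Hypothetical helper: the next maxChunkSize to retry with after an HTTP-413
// (or a connection close on a request larger than the failsafe size).
// Returns -1 once minChunkSize is reached and the upload should give up.
qint64 nextMaxChunkSize(qint64 requestSize)
{
    if (requestSize > failsafeMaxChunkSize)
        return failsafeMaxChunkSize; // failsafe applies; retry immediately
    const qint64 halved = requestSize / 2;
    return halved >= minChunkSize ? halved : -1;
}
```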
While failsafeMaxChunkSize can be adjusted in nextcloud.cfg or via the env var OWNCLOUD_FAILSAFE_MAX_CHUNK_SIZE, it should have a reasonable default so that this isn't required. When set properly, it limits the number of retries needed and enables uploads in the cases where the HTTP connection is closed in response to sending too much data.
For cases where the upload limit in the web server is greater than 1M and a connection close error is only received for uploads >100M, the default values allow all uploads to succeed without any additional configuration.
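For illustration, the option could be set like this (placing it in nextcloud.cfg under the `[General]` section follows the existing chunk-size options there; the value is just an example):

```ini
; nextcloud.cfg — hypothetical placement, value = 100 MB
[General]
failsafeMaxChunkSize=100000000
```

The same value could also come from the environment via `OWNCLOUD_FAILSAFE_MAX_CHUNK_SIZE=100000000`.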
This should fix most cases reporting errors in this area like #4278 and #925.
For 2 concurrent uploads (~0.5 GB each) with the nginx frontend limited to 100M, the logs look like this (from an earlier implementation without failsafeMaxChunkSize):
2022-08-06 22:18:06:368 [ warning nextcloud.sync.credentials.webflow ./src/gui/creds/webflowcredentials.cpp:229 ]: "Error transferring https://nextcloud/remote.php/dav/uploads/user/2789530175/0000000000000001 - server replied: Request Entity Too Large"
2022-08-06 22:18:06:369 [ info nextcloud.sync.networkjob.put ./src/libsync/propagateupload.cpp:87 ]: PUT of "https://nextcloud/remote.php/dav/uploads/user/2789530175/0000000000000001" FINISHED WITH STATUS "UnknownContentError Server replied \"413 Request Entity Too Large\" to \"PUT https://nextcloud/remote.php/dav/uploads/user/2789530175/0000000000000001\"" QVariant(int, 413) QVariant(QString, "Request Entity Too Large")
2022-08-06 22:18:06:369 [ debug nextcloud.sync.propagator.upload ./src/libsync/propagateupload.cpp:662 ] [ OCC::PropagateUploadFileCommon::commonErrorHandling ]: "<html>\r\n<head><title>413 Request Entity Too Large</title></head>\r\n<body>\r\n<center><h1>413 Request Entity Too Large</h1></center>\r\n<hr><center>nginx/1.22.0</center>\r\n</body>\r\n</html>\r\n"
2022-08-06 22:18:06:369 [ info nextcloud.sync.database ./src/common/syncjournaldb.cpp:1798 ]: Setting blacklist entry for "test.img.gz" 1 "Upload of 316 MB exceeds the max upload size allowed by the server. Reducing max upload size to 158 MB" 1659817086 0 1627988766 "" "" 0
2022-08-06 22:18:30:645 [ warning nextcloud.sync.credentials.webflow ./src/gui/creds/webflowcredentials.cpp:229 ]: "Error transferring https://nextcloud/remote.php/dav/uploads/user/2750000573/0000000000000000 - server replied: Request Entity Too Large"
2022-08-06 22:18:30:645 [ info nextcloud.sync.networkjob.put ./src/libsync/propagateupload.cpp:87 ]: PUT of "https://nextcloud/remote.php/dav/uploads/user/2750000573/0000000000000000" FINISHED WITH STATUS "UnknownContentError Server replied \"413 Request Entity Too Large\" to \"PUT https://nextcloud/remote.php/dav/uploads/user/2750000573/0000000000000000\"" QVariant(int, 413) QVariant(QString, "Request Entity Too Large")
2022-08-06 22:18:30:645 [ debug nextcloud.sync.propagator.upload ./src/libsync/propagateupload.cpp:662 ] [ OCC::PropagateUploadFileCommon::commonErrorHandling ]: "<html>\r\n<head><title>413 Request Entity Too Large</title></head>\r\n<body>\r\n<center><h1>413 Request Entity Too Large</h1></center>\r\n<hr><center>nginx/1.22.0</center>\r\n</body>\r\n</html>\r\n"
2022-08-06 22:18:30:646 [ info nextcloud.sync.database ./src/common/syncjournaldb.cpp:1798 ]: Setting blacklist entry for "Test.ISO" 1 "Upload of 158 MB exceeds the max upload size allowed by the server. Reducing max upload size to 80 MB" 1659817110 0 1418024480 "" "" 0
2022-08-06 22:18:31:467 [ warning nextcloud.sync.credentials.webflow ./src/gui/creds/webflowcredentials.cpp:229 ]: "Error transferring https://nextcloud/remote.php/dav/uploads/user/2789530175/0000000000000002 - server replied: Request Entity Too Large"
2022-08-06 22:18:31:467 [ info nextcloud.sync.networkjob.put ./src/libsync/propagateupload.cpp:87 ]: PUT of "https://nextcloud/remote.php/dav/uploads/user/2789530175/0000000000000002" FINISHED WITH STATUS "UnknownContentError Server replied \"413 Request Entity Too Large\" to \"PUT https://nextcloud/remote.php/dav/uploads/user/2789530175/0000000000000002\"" QVariant(int, 413) QVariant(QString, "Request Entity Too Large")
2022-08-06 22:18:31:467 [ debug nextcloud.sync.propagator.upload ./src/libsync/propagateupload.cpp:662 ] [ OCC::PropagateUploadFileCommon::commonErrorHandling ]: "<html>\r\n<head><title>413 Request Entity Too Large</title></head>\r\n<body>\r\n<center><h1>413 Request Entity Too Large</h1></center>\r\n<hr><center>nginx/1.22.0</center>\r\n</body>\r\n</html>\r\n"
2022-08-06 22:18:31:467 [ info nextcloud.sync.database ./src/common/syncjournaldb.cpp:1798 ]: Setting blacklist entry for "test.img.gz" 1 "Upload of 158 MB exceeds the max upload size allowed by the server. Reducing max upload size to 80 MB" 1659817111 0 1627988766 "" "" 0
After 3 attempts, uploads succeed.
Hope this is accepted. Feedback is welcome.
Codecov Report
Merging #4826 (9ef12fb) into master (3dc583c) will decrease coverage by 3.61%. Report is 80 commits behind head on master. The diff coverage is 95.45%.
:exclamation: Current head 9ef12fb differs from pull request most recent head 1cb6a19. Consider uploading reports for the commit 1cb6a19 to get more accurate results
Additional details and impacted files
@@ Coverage Diff @@
## master #4826 +/- ##
==========================================
- Coverage 60.79% 57.18% -3.61%
==========================================
Files 145 138 -7
Lines 18836 17187 -1649
==========================================
- Hits 11451 9829 -1622
+ Misses 7385 7358 -27
| Files | Coverage Δ | |
|---|---|---|
| src/libsync/account.h | 52.00% <100.00%> (+9.14%) | :arrow_up: |
| src/libsync/configfile.cpp | 26.25% <100.00%> (-1.78%) | :arrow_down: |
| src/libsync/owncloudpropagator.cpp | 86.32% <100.00%> (+1.79%) | :arrow_up: |
| src/libsync/propagateuploadng.cpp | 83.95% <100.00%> (-1.10%) | :arrow_down: |
| src/libsync/propagateuploadv1.cpp | 69.44% <100.00%> (ø) | |
| src/libsync/syncfileitem.h | 38.59% <100.00%> (-4.96%) | :arrow_down: |
| src/libsync/syncoptions.h | 100.00% <100.00%> (ø) | |
| src/libsync/propagateupload.cpp | 77.16% <93.33%> (+1.62%) | :arrow_up: |
| src/libsync/syncoptions.cpp | 71.73% <90.90%> (-2.73%) | :arrow_down: |
Hi @allexzander, are you OK with the updates addressing your review, or do you expect additional changes?
Thanks for your pull request :+1:
It's a good improvement to handle a 413 response properly. I wonder if it's possible to make this feature work without a new configuration option. If I read the existing code correctly, we have some calculation to increase the chunk size step by step. Could we store the last working chunk size and fall back to it on a 413 response? I guess there is nothing wrong with an additional configuration option, but we already have 4 for the chunked uploads :thinking:
Hi @kesselb, I see your point.
The reason for the config value is primarily to have a starting point for the auto-sensing that doesn't hurt performance too much (since both cases, connection close and HTTP-413, are handled the same for uploads larger than failsafeMaxChunkSize).
I guess what can be done is to have a built-in range for failsafeMaxChunkSize (e.g. from 100 MB down to 25 MB) that is evaluated on the first failures and then remembered in the KV store. If uploads still fail, auto-sensing would go down to minChunkSize, but keep this only until the next restart and start from the remembered failsafeMaxChunkSize again.
This ensures that performance cannot get too bad, or stay bad when limits are changed again.
The only question is whether this should be reset at some point or stay in the DB forever?
Hi @kesselb, @allexzander,
Have updated the implementation as follows:
- On client restart, `_min` and `_max` chunk size apply (always), until the first failure.
- On the first failure, the remembered `_failsafe` chunk size applies and is used as the start for auto-sensing.
- The sensed `_failsafe` chunk size is remembered down to `LowestPossibleFailsafeMaxChunkSize` (25 MB). Lower limits than this are not persisted and always require auto-sensing after a client restart, but this ensures a misconfiguration of the server cannot persist values that would seriously hurt upload performance (see the sketch below).
- In addition to the `_failsafe` size, the `_initial` chunk size is also stored, as this eliminates the initial upload with `10M` that would otherwise happen with every new sync job.
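A minimal sketch of the persistence rule in Qt-style C++; `LowestPossibleFailsafeMaxChunkSize` comes from the list above, while the function name, the upper bound, and the return convention are hypothetical illustrations:

```cpp
#include <QtGlobal>
#include <algorithm>

constexpr qint64 LowestPossibleFailsafeMaxChunkSize = 25ll * 1000 * 1000; // 25 MB
constexpr qint64 assumedUpperFailsafeBound = 100ll * 1000 * 1000;         // 100 MB (assumption)

// Returns the failsafe chunk size to persist, or -1 when the sensed size is
// below the lowest persistable value and should stay a runtime-only value
// that is re-sensed after the next client restart.
qint64 failsafeSizeToPersist(qint64 sensedSize)
{
    if (sensedSize < LowestPossibleFailsafeMaxChunkSize)
        return -1;
    return std::clamp(sensedSize, LowestPossibleFailsafeMaxChunkSize, assumedUpperFailsafeBound);
}
```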
The latest commit addresses the `_initial` chunk size, persisting it only when the chunk size is dynamic (and supported). However, I guess this value is better kept as a runtime value in account(), which probably should be done in a separate PR. For reference, I have not removed it.
AppImage file: nextcloud-PR-4826-c78288e61d4b4582aba6f55952f1d3b0e09ab82f-x86_64.AppImage
To test this change/fix you can simply download the AppImage file above and test it.
Please make sure to quit your existing Nextcloud app and backup your data.
@mgallien WDYT?
@mgallien, as this is also causing clients to fail to sync, perhaps it'd be good to make it part of the fixes for reliable syncing?
If it takes more time to finish this pull request, we could also revert https://github.com/nextcloud/desktop/pull/3600
maxChunkSize = 1000 is only possible when the server accepts such a big file.
> If it takes more time to finish this pull request, we could also revert #3600
> maxChunkSize = 1000 is only possible when the server accepts such a big file.
Reverting it while a better fix is worked on would be appreciated.
Btw. I'm happy to update the PR if anyone can find the time to review it. Nevertheless, reducing maxChunkSize in the meantime is a quick and safe fix.
@claucambra, will look into it this week
Rebased and simplified the changes. Tests need an update (they are failing at the moment). Will take care of it in the next few days.
@claucambra, I'm done from my side. The remaining check failures seem to be unrelated to this PR (they are failing for others as well).
The PR does not introduce new configuration and does not persist anything. Everything is kept as runtime state in the Account instance for its lifetime. If something should be persisted, it could be done from there, but then the lifecycle would also need to be managed, which is very likely too complex for this matter.
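As a rough illustration of this final approach, the runtime-only state could look like this (a sketch; the class, member, and method names are hypothetical, not the PR's actual fields):

```cpp
#include <QtGlobal>

// Sketch: chunk-size state kept for the lifetime of the Account instance.
// Nothing is written to disk, so a restart naturally resets the sensing.
class AccountChunkState
{
public:
    qint64 sensedMaxChunkSize() const { return _sensedMaxChunkSize; }
    void setSensedMaxChunkSize(qint64 size) { _sensedMaxChunkSize = size; }

private:
    qint64 _sensedMaxChunkSize = -1; // -1 means "not sensed yet"
};
```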
Comment removed. Unrelated. Moved to separate issue.
@asheroto, please remove your comment from this PR; it is unrelated. HTTP-412 is sent for missing headers in chunked uploads, while this PR is about upload size, and its change isn't contained in any client yet.
You might need to create a separate issue.
Sorry, will do. I was just trying to intercept the PR in case it contained the bug, but I'm not familiar enough with the project to know. 😊
I will create a separate issue.
@allexzander, could you please run the Windows CI tests one more time to have updated output? Thanks!
Compiled & tested with jkellerer/building-windows, but the tests don't fail there. Need to try harder.
AppImage file: nextcloud-PR-4826-09be2f7e8139a39a115396d77a8cf666b548bcb4-x86_64.AppImage
To test this change/fix you can simply download the AppImage file above and test it.
Please make sure to quit your existing Nextcloud app and backup your data.
This is bs man. Limiting uploads literally ruins the whole point of cloud storage.
> This is bs man. Limiting uploads literally ruins the whole point of cloud storage.
This doesn't limit uploads at all. It just pushes the upload in smaller chunks.
You can still upload big files with little to no effect on performance. The benefit of this is to allow it to work with certain network tunnels.
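Rough arithmetic of what chunking means here, matching the 100M nginx limit from the log excerpt above (illustrative numbers only):

```cpp
#include <cstdint>
#include <cstdio>

int main()
{
    // A ~500 MB file with a 100 MB max chunk size becomes 5 PUT requests;
    // the whole file still arrives, each request just stays under the limit.
    const std::int64_t fileSize  = 500ll * 1000 * 1000;
    const std::int64_t chunkSize = 100ll * 1000 * 1000;
    const std::int64_t numChunks = (fileSize + chunkSize - 1) / chunkSize;
    std::printf("%lld chunks\n", static_cast<long long>(numChunks)); // prints "5 chunks"
}
```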
Hi team, what is needed to get this to the finishing line & get it merged?
@jospoortvliet, what's missing are fixes for the unit tests. On the test systems I had access to, Win Server 2016 (x64) and macOS (x64), I had no test failures (for the parts that are supported; Win 2016 seems to be missing some parts, but the test that failed here was working fine).
Another Win 11 instance produced some other issues, but that ran on ARM64, so it's not the most reliable result. To be honest, I wasn't so sure that the changes made here are causing the test failures, as the results vary quite a bit depending on the OS used.
Is it possible to revive this PR?
I had to fiddle around with the chunk size config on my fresh new system to get it to work. It seems like this PR would modify the parameters dynamically on the fly, which could solve most such problems where the reverse proxy (or whatever sits in front) doesn't like the out-of-the-box chunk size.
Adding the following configs helped me in my current case on macOS:
chunkSize=10000000
minChunkSize=1000000
maxChunkSize=50000000
targetChunkUploadDuration=6000
Up! Please take a look at this, team!