add --disable-scanner to backup command
The scanner process only has a cosmetic effect, feeding the progress printer, and can be disabled without impacting functionality when the user does not need an estimate of completion.
In many cases the scanner process can provide beneficial priming of the file system cache, so as general advice it should not be disabled. However, tests have shown that backups of NFS and FUSE-based filesystems, where stat(2) is relatively expensive, can be significantly faster without the scanner.
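As a usage sketch (the repository and source paths here are placeholders, not taken from this PR), a backup of an expensive-to-stat NFS mount might skip the scanner like this:

```shell
# Skip the pre-backup size scan; progress output will lack an ETA and
# total-size estimate. Paths are illustrative placeholders.
restic -r /srv/restic-repo backup --disable-scanner /mnt/nfs/data
```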
What does this PR change? What problem does it solve?
This patch lets users avoid the unnecessary I/O caused by the job size scanner during backup, when appropriate.
Was the change previously discussed in an issue or on the forum?
Michael Eischer previously suggested disabling the scanner implicitly for quiet jobs in PR #2336.
His testing showed that although user and system time was reduced, real time could in fact increase. Also, the difference was relatively small when run locally on SSD.
Checklist
- [x] I have read the contribution guidelines.
- [x] I have enabled maintainer edits.
- [ ] I have added tests for all code changes.
- [x] I have added documentation for relevant changes (in the manual).
- [x] There's a new file in `changelog/unreleased/` that describes the changes for our users (see template).
- [x] I have run `gofmt` on the code in all commits.
- [x] All commit messages are formatted in the same style as the other commits in the repo.
- [ ] I'm done! This pull request is ready for review.
Benchmark results
I used Michael Eischer's scripts to generate test data: 1,000,000 files in 10,103 directories, totalling 4 GB. The `flushc` function was changed to flush the cache by unmounting and remounting the filesystem.
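A minimal sketch of the modified `flushc`, assuming the test data lives on an NFS mount at a placeholder path like `/mnt/nfs` (the mount point and an existing fstab entry are assumptions, not taken from the scripts):

```shell
# Drop the NFS client's cached data and metadata for the test filesystem
# by cycling the mount. Requires privileges and a matching fstab entry.
flushc() {
    umount /mnt/nfs || return 1   # placeholder mount point
    mount /mnt/nfs || return 1
}
```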
Hardware
The test data was stored on an NFS server (EMC Unity). The NFS client was a virtual machine running Ubuntu 20.04 with 10 GiB RAM and two cores. The system disk used for the repo and the cache was backed by NVMe on the hypervisor.
The NFS mount was using NFS v4.1 over IPv6 on a 10 Gbps link.
Results
With scanner, cold filesystem cache
| real (s) | user (s) | sys (s) |
|---|---|---|
| 2332.496 | 342.403 | 325.810 |
| 2328.578 | 318.518 | 333.364 |
| 2383.931 | 319.679 | 340.804 |
Without scanner, cold filesystem cache
| real (s) | user (s) | sys (s) |
|---|---|---|
| 1330.922 | 283.043 | 213.047 |
| 1348.730 | 281.155 | 215.864 |
| 1341.197 | 279.321 | 217.178 |
With scanner, warm filesystem cache
| real (s) | user (s) | sys (s) |
|---|---|---|
| 977.128 | 191.971 | 176.165 |
| 981.331 | 193.669 | 176.629 |
| 967.322 | 190.815 | 176.513 |
Without scanner, warm filesystem cache
| real (s) | user (s) | sys (s) |
|---|---|---|
| 1264.398 | 266.654 | 199.560 |
| 1277.799 | 268.134 | 201.512 |
| 1263.922 | 268.960 | 199.372 |
Summary
The numbers are remarkably stable.
We can see that disabling the scanner significantly reduces time when the filesystem cache is cold. Reductions in time (scanner disabled vs. enabled, averaged over the three runs):
| real | user | sys |
|---|---|---|
| -42.9% | -14.0% | -35.4% |
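For reference, the percentages are derived from the means of the three runs in the tables above; a small sketch for the real-time column:

```shell
# Mean real time with and without the scanner (cold cache), and the
# resulting percentage change. Values are copied from the tables above.
awk 'BEGIN {
    with_mean    = (2332.496 + 2328.578 + 2383.931) / 3
    without_mean = (1330.922 + 1348.730 + 1341.197) / 3
    printf "real: %+.1f%%\n", (without_mean / with_mean - 1) * 100
}'
```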
With the scanner disabled, a warm filesystem cache makes the backup only a little faster than a cold one:
| real | user | sys |
|---|---|---|
| -5.3% | -4.7% | -7.1% |
However, with the scanner enabled and a warm filesystem cache, we get the fastest times of all. It's a bit of a mystery to me how even the sys time is lower with the scanner enabled. Change relative to the scanner-disabled warm run:
| real | user | sys |
|---|---|---|
| -23.1% | -28.3% | -11.8% |
Conclusion
The new flag is not something that should be recommended unreservedly, but when the filesystem is much larger than the client's RAM, or when the backup agent mounts file systems dynamically, it can have a significant positive impact on backup times and resource usage.
Thank you for your comments; they all make sense. I did the rename as a separate patch just to make the process clear, but obviously these patches are best squashed during merge.