restic
dump and cp from mounted filesystem is about 8 times slower than restore
Output of restic version
restic 0.13.1 compiled with go1.18 on linux/amd64
How did you run restic exactly?
theFile is about 2522 MB
time ./restic dump --cacert key latest theFile > temp.txt
results in
real 1m3,982s
user 0m44,149s
sys 0m7,611s
time ./restic restore --cacert key latest --include theFile -t restoretest results in
real 0m7,994s
user 0m37,991s
sys 0m4,723s
mounting and then using
time cp theFile /target
results in
real 1m9,124s
user 0m0,032s
sys 0m4,560s
Using cat
real 1m9,018s
user 0m0,021s
sys 0m4,752s
The system itself is pretty fast; using
time cat theFile > theFile2
results in
real 0m1,352s
user 0m0,004s
sys 0m1,347s
Update:
Restoring the same file on another host via GBit network connection and restic 0.13.1 (v0.13.0-267-g6cbeb4a9) compiled with go1.18.1 on linux/amd64
real 1m10,923s
user 0m40,697s
sys 0m15,149s
What I've noticed is that dump only reaches about 300 MBit/s of network usage, whereas restore fully uses the GBit network connection:
real 0m24,490s
user 0m34,982s
sys 0m7,686s
Update:
Doing the same on arm64 with restic 0.13.1 (v0.13.0-267-g6cbeb4a9) compiled with go1.18.1 on linux/arm64 and a 400 MB file results in, for restore:
real 0m34,408s
user 0m55,729s
sys 0m5,111s
for dump
real 0m48,114s
user 0m53,511s
sys 0m4,424s
for cp
real 0m38,672s
user 0m0,008s
sys 0m2,060s
What backend/server/service did you use to store the repository?
rest-server, running on the same system on which restore was tested
Expected behavior
At least restic dump should be almost as fast as the restore command. For the mounted filesystem, fuse is probably the culprit, since the filesystem runs in userspace. But it is still quite slow, and I'm fairly sure some optimizations are possible there. If so, that would in my view be even more helpful than optimizing the dump command.
On Windows I did a test with a slower rest-server backend, needing about 30 seconds to restore an 850 MB file. There, with restic 0.13.1 (v0.13.0-251-g98a3125c) compiled with go1.18.1 on windows/amd64, dump and restore showed no significant difference in speed. Using restic 0.13.1 (v0.13.0-251-g98a3125c) compiled with go1.18.1 on linux/amd64, on the Linux system with the repo containing the 2522 MB file, dump was as slow as with the current 0.13.1 release version. restic 0.13.1 (v0.13.0-267-g6cbeb4a9) compiled with go1.18.1 on linux/amd64 also showed the slow dump behaviour.
Actual behavior
dump command and copying from mounted file system are very slow.
Steps to reproduce the behavior
Create a repository, add a big file and restore or dump.
Do you have any idea what may have caused this?
Perhaps it has something to do with caching, or with how big the requested chunks are. With that 2522 MB file it is very likely that there is a lot of deduplication with other content in the repo, which might also play a role.
The slow speed when using the mounted repo might be a limitation of the fuse filesystem, since it runs in user space and has to do a lot of context switches?
Do you have an idea how to solve the issue?
No.
Did restic help you today? Did it make you happy in any way?
Restic makes me happy every day since it keeps my data safely stored.
> At least restic dump should be almost as fast as the restore command. For the mounted filesystem, fuse is probably the culprit, since the filesystem runs in userspace. But it is still quite slow, and I'm fairly sure some optimizations are possible there. If so, that would in my view be even more helpful than optimizing the dump command.
The part about dump being slow is a duplicate of #3406. The problem here is that a 2.5 GB file consists of roughly 2,000 chunks, which have to be requested sequentially. At 70 seconds, that's 35 milliseconds per chunk, which is already quite good performance-wise (about 10 milliseconds just to transfer one chunk, plus reading it from disk, plus some other overhead). The mentioned issue suggests parallelizing the dump command, which would fix that performance problem.
For the mounted filesystem, this is unfortunately a much harder problem: using the normal filesystem API, we have no way of guessing whether some process reading a file wants to read everything or just a small part of the file. You could however try whether the following helps with performance:
Increase the readahead in https://github.com/restic/restic/blob/d9ea1e9ee2a994eb34b54ca292459c751786d2e4/cmd/restic/cmd_mount.go#L126
I'm not sure whether this will work, as according to the manpage (https://manpages.debian.org/jessie/fuse/mount.fuse.8.en.html) the kernel enforces a default readahead limit of 128kb controlled by max_readahead. But there's currently no easy way to pass that parameter when creating the mountpoint.
Thanks for the fast reply and the pointer to https://github.com/restic/restic/issues/3406; searching with the keywords "dump slow" did not reveal that issue.
Regarding fuse, I've built restic with systemFuse.MaxReadahead(128 * 1024 * 8), but no difference in speed occurred. I also set max_readahead=1048576 in /etc/fuse.conf and used the custom-built version, but still no difference.
So this looks like a kernel limitation. Good to know that the restore command provides the fastest restore option.
The question remains: should this issue be closed or kept open for the fuse part? Hopefully this kernel limitation will not be present in future kernels.
Now that I'm thinking of it: what we'd actually need to speed up fuse is the following: a readahead of several dozen MBs, which would have to be requested from restic as a single read operation. Then our fuse implementation could start to load blobs concurrently. That might narrow the performance gap somewhat, but probably won't close it completely.
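The idea can be sketched as a small prefetch cache: when a read hits blob i, loads for the next few blobs are started concurrently, so a sequential reader usually finds them already available. This is a hedged illustration under assumed names (`blobStore`, `prefetcher` are hypothetical, not restic's fuse code), and a real implementation would also need to bound and evict the cache:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// blobStore is a stand-in for the repository: loading a blob is slow.
type blobStore struct{ blobs [][]byte }

func (s *blobStore) load(i int) []byte {
	time.Sleep(10 * time.Millisecond) // simulated repository round trip
	return s.blobs[i]
}

// blobFuture is a blob load in progress; done is closed once data is set.
type blobFuture struct {
	done chan struct{}
	data []byte
}

// prefetcher deduplicates loads and, on each access, kicks off
// concurrent loads for a readahead window of following blobs.
type prefetcher struct {
	store  *blobStore
	window int
	mu     sync.Mutex
	loads  map[int]*blobFuture
}

func newPrefetcher(store *blobStore, window int) *prefetcher {
	return &prefetcher{store: store, window: window, loads: make(map[int]*blobFuture)}
}

// start begins loading blob i unless it is already cached or in flight.
func (p *prefetcher) start(i int) *blobFuture {
	p.mu.Lock()
	defer p.mu.Unlock()
	if f, ok := p.loads[i]; ok {
		return f
	}
	f := &blobFuture{done: make(chan struct{})}
	p.loads[i] = f
	go func() {
		f.data = p.store.load(i)
		close(f.done) // publishes f.data to waiters
	}()
	return f
}

// get returns blob i, prefetching the window behind it so that a
// sequential reader rarely has to wait for a cold load.
func (p *prefetcher) get(i int) []byte {
	for j := i + 1; j <= i+p.window && j < len(p.store.blobs); j++ {
		p.start(j)
	}
	f := p.start(i)
	<-f.done
	return f.data
}

func main() {
	store := &blobStore{blobs: [][]byte{[]byte("hello "), []byte("from "), []byte("the mount")}}
	p := newPrefetcher(store, 2)
	var out []byte
	for i := range store.blobs {
		out = append(out, p.get(i)...)
	}
	fmt.Println(string(out)) // prints "hello from the mount"
}
```

The catch mentioned above still applies: with 128 kB reads arriving from the kernel, each read maps to roughly one blob, so without a larger readahead window handed to restic in a single operation, the prefetcher has little room to work ahead.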
Let's keep this issue open for now, as optimizing the fuse performance would at least be nice to have.
I added concurrent readahead of a single blob to fuse.openFile.Read and it gets about twice as fast when loading from a USB disk. I'm getting a deadlock in the tests, though. Looks like it has to do with the semaphore in the backend.
I guess I got really lucky with the 2x speedup, but #4012 does show a 25% speedup for some workloads.
I think you might have referenced the wrong PR, @greatroar.
Thanks, that's #4013.
For me mount+cp was even 25x slower than restore (restic 0.15.2 on a medium-sized VPS running Ubuntu 22.04; I was backing up/restoring a 40GB git server, so there were tons of small files, and I'm using Google Drive to store the backup).
I don't really mind this; I understand that, at least with mount, there are few possibilities for big, reliable speedups in this scenario, except maybe for massive caching.
However, I think the documentation should mention this. Right now https://restic.readthedocs.io/en/latest/050_restore.html reads as if choosing between mount and restore were just a matter of personal preference; there should be a big warning that, if you want to restore lots of files (or even a whole snapshot), restore can be a lot faster.
And maybe restic mount could even print a short note, like: "Note: If you want to restore lots of files, restic restore might be much faster than copying from the mount."
@DanielGibson Would you be willing to create a PR to adjust the documentation?
sure, see https://github.com/restic/restic/pull/4345