Calling "ls" on rclone/pcloud mount might download full file listing --> very slow!
I have tried to get a file listing of a borg repo mounted via rclone/pcloud. This takes very long and downloads a lot of chunks from the server. Interestingly, future file listings in the same repo are instantaneous, so it looks like all folder listings (but not the file contents) are downloaded. This wastes a lot of time and bandwidth. With big repos this leads to a situation where borg is not usable because it takes so long
It would be much better if only the tree (folder listing) of the requested data would be downloaded and other trees only on demand
Testing:
# [...] Set up pcloud remote (called "pcloud" here) in rclone....
# [...] Set up borg repo, I have repokey ChaCha20-Poly1305
export BORG_REPO=rclone:pcloud:Moritz/Backup/borg
export BORG_PASSPHRASE=#........
mkdir test
cd test
dd if=/dev/random of=test/file2 count=1000 bs=1k
dd if=/dev/random of=file2 count=1000 bs=1k
mkdir dir dir/dir2
dd if=/dev/random of=dir/file3 count=1000 bs=1k
dd if=/dev/random of=test/dir/dir2/file4 count=1000 bs=1000
borg create --progress test test
mkdir /tmp/pcloud
borg mount --debug --foreground /tmp/pcloud test
ls /tmp/pcloud/test # ... takes long
ls /tmp/pcloud/test/dir # instantaneous
ls /tmp/pcloud/test/dir/dir2 # instantaneous
# test if contents are also downloaded proactively --> no (as it takes time and prints debug output)
md5sum /tmp/pcloud/test/file1
Also, of the lines that are printed, the line Starting new HTTP connection (1): 127.0.0.1:40819 stays at the terminal bottom the longest, other lines are followed quickly by more output. Does this repeated HTTP connection buildup eat a lot of time as well?
The debug output of borg mount --debug ... is:
using builtin fallback logging configuration
11 self tests completed in 0.03 seconds
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/stat HTTP/1.1" 200 137
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "GET /%5Bpcloud:Moritz/Backup/borg/%5D/config/readme HTTP/1.1" 200 73
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/stat HTTP/1.1" 200 138
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "GET /%5Bpcloud:Moritz/Backup/borg/%5D/config/version HTTP/1.1" 200 1
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/stat HTTP/1.1" 200 129
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "GET /%5Bpcloud:Moritz/Backup/borg/%5D/config/id HTTP/1.1" 200 64
LOCK-INIT: initialising. store: <Store(url='rclone:pcloud:Moritz/Backup/borg', levels=[('archives/', [0]), ('config/', [0]), ('cache/', [0]), ('locks/', [0]), ('data/', [2]), ('keys/', [0])])>, stale: 1800s, refresh: 900s.
LOCK-ACQUIRE: trying to acquire a lock. exclusive: False.
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/list HTTP/1.1" 200 16
LOCK-CREATE: creating lock in store. key: a771f408648f80c8, lock: {'exclusive': False, 'hostid': 'moes@181083354778154', 'processid': 105405, 'threadid': 0, 'time': '2025-04-01T16:04:44.120+00:00'}.
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/stat HTTP/1.1" 200 18
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/uploadfile?fs=pcloud%3AMoritz%2FBackup%2Fborg%2F&remote=locks HTTP/1.1" 200 3
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/list HTTP/1.1" 200 171
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/stat HTTP/1.1" 200 157
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "GET /%5Bpcloud:Moritz/Backup/borg/%5D/locks/a771f408648f80c8 HTTP/1.1" 200 131
LOCK-ACQUIRE: success! no exclusive locks detected.
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/stat HTTP/1.1" 200 142
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "GET /%5Bpcloud:Moritz/Backup/borg/%5D/config/manifest HTTP/1.1" 200 439
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/stat HTTP/1.1" 200 138
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "GET /%5Bpcloud:Moritz/Backup/borg/%5D/keys/repokey HTTP/1.1" 200 486
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/stat HTTP/1.1" 200 138
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "GET /%5Bpcloud:Moritz/Backup/borg/%5D/keys/repokey HTTP/1.1" 200 486
security: read previous location 'rclone:pcloud:Moritz/Backup/borg'
security: read manifest timestamp '2025-04-01T16:04:26.727148+00:00'
security: determined newest manifest timestamp as 2025-04-01T16:04:26.727148+00:00
security: repository checks ok, allowing access
mount data cache capacity: 8 chunks
Mounting filesystem
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/list HTTP/1.1" 200 1523
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/stat HTTP/1.1" 200 258
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "GET /%5Bpcloud:Moritz/Backup/borg/%5D/data/04/cb/04cbc2a70499172eed86a02ea3171e82d76b63312d4d538033303cc3d13db700 HTTP/1.1" 200 485
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/stat HTTP/1.1" 200 258
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "GET /%5Bpcloud:Moritz/Backup/borg/%5D/data/5d/c3/5dc331316ce83b5fd785237978ee485f216acb615ac0bb945c6ae9f6972e17d4 HTTP/1.1" 200 444
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/stat HTTP/1.1" 200 258
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "GET /%5Bpcloud:Moritz/Backup/borg/%5D/data/64/11/6411db2cf74580a8f5aae72f8649058b65adc2fd3f9396bea32309b8f528a6b9 HTTP/1.1" 200 445
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/stat HTTP/1.1" 200 258
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "GET /%5Bpcloud:Moritz/Backup/borg/%5D/data/93/8c/938cd195ef696b002b1a2bcd0c7712de27e3316719d5c7643ddd840884895a92 HTTP/1.1" 200 423
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/stat HTTP/1.1" 200 258
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "GET /%5Bpcloud:Moritz/Backup/borg/%5D/data/e6/d9/e6d9c7233b6659b041baf347a8ab1e165b9054530c76ffa6a017ba8bf14b5d81 HTTP/1.1" 200 497
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/stat HTTP/1.1" 200 258
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "GET /%5Bpcloud:Moritz/Backup/borg/%5D/data/ef/8f/ef8f4ff62e85865458f096e7ea30417fa24dfb66212bfef9ba567d3e93dcaefa HTTP/1.1" 200 496
Initializing pyfuse3
Calling fuse_session_new
Calling fuse_session_mount
pyfuse-02: No tasks waiting, starting another worker (now 2 total).
########## ----- ls /tmp/test/test ------- #####################################
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/list HTTP/1.1" 200 1523
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/stat HTTP/1.1" 200 258
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "GET /%5Bpcloud:Moritz/Backup/borg/%5D/data/64/11/6411db2cf74580a8f5aae72f8649058b65adc2fd3f9396bea32309b8f528a6b9 HTTP/1.1" 200 445
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/stat HTTP/1.1" 200 258
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "GET /%5Bpcloud:Moritz/Backup/borg/%5D/data/64/11/6411db2cf74580a8f5aae72f8649058b65adc2fd3f9396bea32309b8f528a6b9 HTTP/1.1" 200 445
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/stat HTTP/1.1" 200 258
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "GET /%5Bpcloud:Moritz/Backup/borg/%5D/data/95/5c/955c56bb70275cff07e1bc6e7dd0022a6f5be68f0fe12e71d264862675baf0cc HTTP/1.1" 200 192
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/stat HTTP/1.1" 200 258
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "GET /%5Bpcloud:Moritz/Backup/borg/%5D/data/f1/ed/f1ed96b32be5bb5ae350cb89273296e57c5224b17a5dc158731dc20d56aa4551 HTTP/1.1" 200 760
fuse: _process_archive completed in 2.0 s for archive test
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/stat HTTP/1.1" 200 258
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "GET /%5Bpcloud:Moritz/Backup/borg/%5D/data/f1/ed/f1ed96b32be5bb5ae350cb89273296e57c5224b17a5dc158731dc20d56aa4551 HTTP/1.1" 200 760
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/stat HTTP/1.1" 200 261
######## ls /tmp/test/dir and ls /tmp/dir/dir2 cause no output #########################
######### md5sum /tmp/pcloud/test/file1 ########################################
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "GET /%5Bpcloud:Moritz/Backup/borg/%5D/data/17/95/17952d134c76c37f98de0a861e11e468df74d9724eefa16cd1df6e9f3c307557 HTTP/1.1" 200 849102
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "POST /operations/stat HTTP/1.1" 200 261
Starting new HTTP connection (1): 127.0.0.1:40819
http://127.0.0.1:40819 "GET /%5Bpcloud:Moritz/Backup/borg/%5D/data/27/7a/277ad6c70fefac77fb1c06cdfcd8d609bfe276e981c0d5720d7b671fc2ba246a HTTP/1.1" 200 151228
^Cpyfuse-02: terminated
borg stores all file metadata in the archive stream (similar to e.g. tar, just no file content). there are no isolated "tree objects".
when first accessing an archive directory inside a borg mount REPO /mountpoint, the complete archive will be read and the fs structure below that directory will be built.
additionally, as you use borg2, all chunks are stored as separate repository objects, so accessing many small chunks in a high latency store is slow.
I see... so that means reading the full file system is the intended behaviour of borg2 or is mandated by FUSE? Or is it in theory possible to change this to improve performance with rclone remotes?
the archive stream is a linear, unindexed thing, so there is no way to read it in a non-linear way.
but, borg mount supports to partially build the fs (e.g. to save memory), but it will still read the archive stream completely (but only putting matching items into the built fs).
that's not mandated by fuse, but doing it differently would need a fundamental change to what a borg archive is.
some of the slowness is likely not due to that, but due to latency adding up when accessing many small objects in the repo.
Ok, this sounds as in the current state the rclone compatibility of borg2 is more theoretical (at least for cloud remotes) because the overhead of small file reading makes it infeasible to read and therefore check the archive? (Would borg check --max-duration even work if all the alotted time is spent only to read in the file structure?) Is there any potential contribution (that I could try to provide) that would reduce the latency or is this just not the direction this project is aiming at?
There is #8572 still open.