gocryptfs icon indicating copy to clipboard operation
gocryptfs copied to clipboard

[Performance] Very slow performance when run gocryptfs over CIFS/SSHFS

Open zvirja opened this issue 1 year ago • 5 comments

Setup:

  • Debian 12
  • gocryptfs: v2.4.0.HEAD; go-fuse v2.3.0; 2023-07-04 go1.18.1 linux/amd64 (compiled myself)
  • sshfs: version 3.7.3

The encrypted dir is mounted via sshfs, gocryptfs is mounted over it.

Scenario:

  • I configured filebrowser via docker, mounted encrypted drive into it and am uploading a file.

The software is simply writing to a file in continuous mode (I verified that via strace, it constantly invokes write()). It's not dd, so chunks could be of different size and could arrive at random rate (as I'm uploading file via HTTP).

If I write directly to unencrypted mount, I get roughly 110 Mbit/s. If I write over gocryptfs I get around 15 MBit/s. That's quite a drop. CPU is low when running over gocryptfs, so it's IO bound.

I tried to investigate a bit and to do that I run sshfs with the following flags:

sshfs user@host:/dir -o LogLevel=DEBUG3,sshfs_debug /mnt/encrypted

That give me a log of SSH operations. Surprisingly when running over gocryptfs I observe the following log:

[42057] WRITE
[42058] FSTAT
  [42056]         STATUS       28bytes (11ms)
  [42057]         STATUS       28bytes (13ms)
  [42058]          ATTRS       41bytes (13ms)
[42059] READ
  [42059]           DATA     3477bytes (10ms)
[42060] WRITE
[42061] FSTAT
  [42060]         STATUS       28bytes (11ms)
  [42061]          ATTRS       41bytes (11ms)
[42062] WRITE
[42063] WRITE
[42064] FSTAT
debug2: channel 0: rcvd adjust 106901
  [42062]         STATUS       28bytes (12ms)
  [42063]         STATUS       28bytes (13ms)
  [42064]          ATTRS       41bytes (13ms)
[42065] READ
  [42065]           DATA     3733bytes (10ms)
[42066] WRITE
[42067] FSTAT
  [42066]         STATUS       28bytes (11ms)
  [42067]          ATTRS       41bytes (11ms)

When I write directly to unencrypted drive I see writes without reads:

  [03091]         STATUS       28bytes (437ms)
debug2: channel 0: rcvd adjust 16384
  [03092]         STATUS       28bytes (438ms)
debug2: channel 0: rcvd adjust 16384
debug2: channel 0: rcvd adjust 16384
  [03093]         STATUS       28bytes (440ms)
[03359] WRITE
[03360] WRITE
  [03094]         STATUS       28bytes (440ms)
debug2: channel 0: rcvd adjust 16384
[03361] WRITE
[03362] WRITE
  [03095]         STATUS       28bytes (437ms)
debug2: channel 0: rcvd adjust 16384
  [03096]         STATUS       28bytes (438ms)
debug2: channel 0: rcvd adjust 16384
  [03097]         STATUS       28bytes (437ms)
[03363] WRITE
[03364] WRITE
debug2: channel 0: rcvd adjust 16384
  [03098]         STATUS       28bytes (437ms)
debug2: channel 0: rcvd adjust 16384
  [03099]         STATUS       28bytes (433ms)

You can see immediate difference:

  • there are reads in addition to writes only
  • there are periodic FSTAT calls

I believe it is the reason of slowdown.

I tried to trace the gocryptfs itself in the middle of the upload (see full trace in the attachments; it's covering a period somewhere during the upload) and found a couple of interesting things:

[pid 59301] fallocate(15, FALLOC_FL_KEEP_SIZE, 241203186, 4128) = -1 EOPNOTSUPP (Operation not supported)
[pid 59301] pwrite64(15, "\352a7\16\377\360\2328\"\240\352=%6\2363\22\36,\222\275K\4\201X\361\27\257E\256n\351"..., 4128, 241203186) = 4128
[pid 59301] write(10, "\30\0\0\0\0\0\0\0\306\252\0\0\0\0\0\0\212\6\0\0\0\0\0\0", 24 <unfinished ...>
[pid 59499] <... read resumed>"\306y\0\0\20\0\0\0\310\252\0\0\0\0\0\0\6\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048784) = 31174
[pid 59301] <... write resumed>)        = 24
[pid 59499] pread64(15,  <unfinished ...>
[pid 59301] read(10,  <unfinished ...>
[pid 59297] <... sched_yield resumed>)  = 0
[pid 59297] futex(0x56376dc40f18, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 59297] nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
[pid 59297] futex(0x56376dc40ff8, FUTEX_WAIT_PRIVATE, 0, {tv_sec=58, tv_nsec=827137070} <unfinished ...>
[pid 59499] <... pread64 resumed>"", 4128, 241236210) = 0
[pid 59499] futex(0x56376dc40ff8, FUTEX_WAKE_PRIVATE, 1) = 1
[pid 59297] <... futex resumed>)        = 0
[pid 59499] fallocate(15, FALLOC_FL_KEEP_SIZE, 241207314, 31350 <unfinished ...>
[pid 59297] sched_yield( <unfinished ...>
[pid 59499] <... fallocate resumed>)    = -1 EOPNOTSUPP (Operation not supported)
[pid 59499] pwrite64(15, "k+\3521vU\310\6B^$\266\216\373\207\321\35Jg\33\1\333\216\350\2\23b\236Z\275\231^"..., 31350, 241207314) = 31350
[pid 59499] write(10, "\30\0\0\0\0\0\0\0\310\252\0\0\0\0\0\0vy\0\0\0\0\0\0", 24) = 24
[pid 59299] <... read resumed>"D\0\0\0\26\0\0\0\312\252\0\0\0\0\0\0\6\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048784) = 68
[pid 59499] read(10,  <unfinished ...>
[pid 59299] dup(18)                     = 14
[pid 59299] lgetxattr("/proc/self/fd/14/NW99X1UeaP_4n0i5Nsegl8in-4KE22Lv8OTUiPkClq1oDNagsooGSehKzuj-DK0N", "user.gocryptfs.FzzKjXGO-TZ8Yilk3"..., 0xc001341000, 500) = -1 EOPNOTSUPP (Operation not supported)
[pid 59299] close(14) 
  • Firstly, it's indeed reading and writing from file handle 10.
  • Secondly, it indeed accesses file attributes (see lgetxattr)

Do you have any idea what is the cause of this? Why does it have to read while I stream write to a file? And more importantly, why is it reading those attributes regularly? Is there any config to tweak that behavior?

Thank you! Am happy to run another diagnostics if you have idea what to run 😊

Full log: strace-gocryptfs.txt

Thank you for your time and the awesome product!!

P.S. I did not run detailed analysis with CIFS, but the speed is comparable, so I suppose the issue is the same.

P.P.S. If I enable SSH compression sometimes I am able to avoid those reads after writes (I guess because the data is still in buffers). In that case I get the same speed as on the unencrypted directory. But that is not stable and after some time the speed drops as well.

zvirja avatar Jul 04 '23 14:07 zvirja

I know this is an old issue, but I just started using gocryptfs since a few days and notice the same behaviour.

There is a remote storage system mounted via sshfs on my computer:

sshfs -o reconnect -o ServerAliveInterval=15 user@server:/path/ local/crypt/

The connection is quite busy, the speed is just around 3 MB/s with pure sshfs (without gocryptfs). On top of sshfs is gocryptfs mounted:

gocryptfs -config local.config local/crypt/ local/plain/

Speed on this mount is much lower, just around 400 kB/s.

I am aware of these issues and I know about the option -noprealloc. Nonetheless, I was not able to get usable speed results when using gocryptfs over sshfs.

Any help is highly appreciated!

Storage: Debian 12 (OpenSSH_9.2p1 Debian-2+deb12u2, OpenSSL 3.0.11 19 Sep 2023) Client: Linux Mint with gocrypts 2.4.0 and sshfs 3.7.1 and FUSE 3.10.5

ontheair81 avatar Mar 06 '24 16:03 ontheair81

@ontheair81 what's your ping time to the storage? Testing right now I see, somewhat surprisingly, no difference between sshfs and gocryptfs-on-sshfs:

$ uname -a
Linux dell19 6.3.8-100.fc37.x86_64

$ ping -n nuetzlich.net
PING nuetzlich.net (144.76.19.111) 56(84) bytes of data.
64 bytes from 144.76.19.111: icmp_seq=1 ttl=57 time=25.4 ms
64 bytes from 144.76.19.111: icmp_seq=2 ttl=57 time=25.5 ms
64 bytes from 144.76.19.111: icmp_seq=3 ttl=57 time=24.9 ms
^C

$ for of in /dev/null sshfs/rand gocryptfs-on-sshfs/rand ; do echo "Writing to $of...." ; dd if=/dev/urandom bs=128k count=100 of=$of ; done

Writing to /dev/null....
100+0 records in
100+0 records out
13107200 bytes (13 MB, 12 MiB) copied, 0.0439322 s, 298 MB/s

Writing to sshfs/rand....
100+0 records in
100+0 records out
13107200 bytes (13 MB, 12 MiB) copied, 5.06013 s, 2.6 MB/s

Writing to gocryptfs-on-sshfs/rand....
100+0 records in
100+0 records out
13107200 bytes (13 MB, 12 MiB) copied, 5.09185 s, 2.6 MB/s

rfjakob avatar Mar 07 '24 09:03 rfjakob

I have to admit, that the problem seems not to be related to gocryptfs: I checked with another desktop computer in another location (and a standard DSL line), and there is pretty much no difference between sshfs and gocryptfs on top of sshfs.

So maybe there is something wrong with the initially used computer, or the problem is related to the type of connection there (MPLS network in a company). Or the ping is too bad (around 60 ms, to answer your question).

Sorry for the noise and thank you for this great software!

ontheair81 avatar Mar 07 '24 12:03 ontheair81

@rfjakob Do you have any thoughts around the analysis and reason why gocryptfs adds those constant lgetxattr calls (which later are translated to FSTAT SSH FS calls)? Those calls seem very redundant and could indeed degrade the performance, which is exactly what I observed. Of course the level of degradation depends on roundtrip time and throughput and of course if you have extremely low ping, you will not feel it much.

I would expect that gocryptfs shouldn't degrade performance unnecessarily, as better performance is always better than worse performance 😉

zvirja avatar Mar 11 '24 14:03 zvirja

I'm not sure where the getxattr calls are coming from. You cab maybr see that in the gocryptfs "-fusedebug" log. Maybe selinux, maybe acls.

But I think for performance they don't matter, as sshfs returns "operation not supported" an the call is not sent over the network.

The fstat calls are maybe a result of pwrite64() vs write().

On Mon, 11 Mar 2024, 15:24 Oleksandr Povar, @.***> wrote:

@rfjakob https://github.com/rfjakob Do you have any thoughts around the analysis and reason why gocryptfs adds those constant lgetxattr calls (which later are translated to FSTAT SSH FS calls)? Those calls seem very redundant and could indeed degrade the performance, which is exactly what I observed. Of course the level of degradation depends on roundtrip time and throughput and of course if you have extremely low ping, you will not feel it much.

I would expect that gocryptfs shouldn't degrade performance unnecessarily, as better performance is always better than worse performance 😉

— Reply to this email directly, view it on GitHub https://github.com/rfjakob/gocryptfs/issues/764#issuecomment-1988564817, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACGA74BSJTI4QE2RNZZYMDYXW5AFAVCNFSM6AAAAAAZ5Z3MUWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSOBYGU3DIOBRG4 . You are receiving this because you were mentioned.Message ID: @.***>

rfjakob avatar Mar 11 '24 15:03 rfjakob