Encrypted ZFS performance issues
System information
Type | Version/Name |
---|---|
Distribution Name | OSX |
Distribution Version | 10.13.4 |
Kernel | root:xnu-4570.51.1~1/RELEASE_X86_64 x86_64 |
Architecture | Intel |
ZFS Version | zfs-macOS-2.1.0-1 |
SPL Version | zfs-kmod-2.1.0-1 |
RAM | 64 GB DDR4 2666 MHz |
CPU | Intel Core i7-8700 |
Summary of Problem
Poor read performance on encrypted ZFS dataset.
I have a 12-disk raidz2 vdev in a zpool. The disks are Dell-branded ST4000NM0033 drives with firmware GA6E.
I created the pool with the following commands:
sudo zpool create -f -o ashift=12 \
-O casesensitivity=insensitive \
-O normalization=formD \
-O compression=lz4 \
-O atime=off \
-O recordsize=256k \
ZfsMediaPool raidz2 \
/var/run/disk/by-path/PCI0@0-SAT0@17-PRT5@5-PMP@0-@0:0 \
/var/run/disk/by-path/PCI0@0-SAT0@17-PRT4@4-PMP@0-@0:0 \
/var/run/disk/by-path/PCI0@0-RP21@1B,4-PXSX@0-PRT31@1f-PMP@0-@0:0 \
/var/run/disk/by-path/PCI0@0-SAT0@17-PRT3@3-PMP@0-@0:0 \
/var/run/disk/by-path/PCI0@0-SAT0@17-PRT2@2-PMP@0-@0:0 \
/var/run/disk/by-path/PCI0@0-SAT0@17-PRT1@1-PMP@0-@0:0 \
/var/run/disk/by-path/PCI0@0-SAT0@17-PRT0@0-PMP@0-@0:0 \
/var/run/disk/by-path/PCI0@0-RP21@1B,4-PXSX@0-PRT2@2-PMP@0-@0:0 \
/var/run/disk/by-path/PCI0@0-RP21@1B,4-PXSX@0-PRT3@3-PMP@0-@0:0 \
/var/run/disk/by-path/PCI0@0-RP21@1B,4-PXSX@0-PRT28@1c-PMP@0-@0:0 \
/var/run/disk/by-path/PCI0@0-RP21@1B,4-PXSX@0-PRT4@4-PMP@0-@0:0 \
/var/run/disk/by-path/PCI0@0-RP21@1B,4-PXSX@0-PRT29@1d-PMP@0-@0:0
zpool add ZfsMediaPool log /dev/disk5s3
zpool add ZfsMediaPool cache /dev/disk5s4
zpool set feature@encryption=enabled ZfsMediaPool
zfs create -o encryption=on -o keylocation=prompt -o keyformat=passphrase ZfsMediaPool/bryan
zfs set com.apple.mimic_hfs=hfs ZfsMediaPool/bryan
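For reference, the encryption settings on the new dataset can be confirmed with a standard property query like the one below (not part of my original setup steps, just a check):

```sh
# verify that the dataset was created with native encryption and that the key
# is loaded; these are standard OpenZFS 2.1 dataset properties
zfs get encryption,keyformat,keylocation,keystatus ZfsMediaPool/bryan
```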
Reading and writing on the non-encrypted dataset work as expected, with speeds exceeding 500 MB/s.
When reading from the encrypted dataset, performance drops sharply, and any random I/O brings the system to a crawl. Writing to the encrypted dataset pushes CPU load to 600%. A program such as Thunderbird is almost unusable on the encrypted ZFS dataset.
I've also tested against an encrypted APFS container on a ZFS zvol in the same pool, and it performs much better than native ZFS encryption.
Describe how to reproduce the problem
I created a random 10 GB file by concatenating ten 1 GB reads from /dev/urandom.
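The exact commands aren't shown here; a minimal sketch of one way to build such a file (a reconstruction for illustration, not necessarily the commands originally used):

```sh
# build a 10 GiB file of random data by appending ten 1 GiB reads
# from /dev/urandom; bs=1m is the BSD dd spelling of a 1 MiB block size
: > /random-10g
for i in $(seq 1 10); do
  dd if=/dev/urandom bs=1m count=1024 >> /random-10g
done
```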
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
ZfsMediaPool 8.18T 24.9T 7.07T /Volumes/ZfsMediaPool
ZfsMediaPool/bryan 1.08T 24.9T 1.08T /Users/bryan
ZfsMediaPool/mailcorestorage 31.6G 24.9T 31.6G -
The ZfsMediaPool/bryan dataset is encrypted and mounted at /Users/bryan; /Volumes/ZfsMediaPool is the same zpool, but not encrypted. I did the following while logged in as root, with no other processes accessing the pool.
Below, the 10 GB random file is copied and read back:
# encrypted write (the source file is on the root filesystem, which is an NVMe disk)
dd if=/random-10g of=/Users/bryan/random-10g bs=1m
10737418240 bytes transferred in 71.818303 secs (149508103 bytes/sec)
# encrypted read
dd if=/Users/bryan/random-10g bs=1m of=/dev/null
10737418240 bytes transferred in 179.890006 secs (59688798 bytes/sec)
# non-encrypted dataset read
dd if=/Volumes/ZfsMediaPool/random-10g bs=1m of=/dev/null
10737418240 bytes transferred in 18.207343 secs (589730098 bytes/sec)
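One note on methodology (an assumption about how to keep the comparison fair, not something shown above): repeated reads can be served from the ARC, so between runs the pool can be cycled to drop cached data, for example:

```sh
# export/import the pool to evict its data from the ARC, then re-load the key
# and remount the encrypted dataset before the next read test
zpool export ZfsMediaPool
zpool import ZfsMediaPool
zfs load-key ZfsMediaPool/bryan   # prompts for the passphrase
zfs mount ZfsMediaPool/bryan
```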
# The first two runs are on the encrypted dataset: the first is uncached, the second is cached
$ time du -hc /Users/bryan/Library/Thunderbird
real 0m3.202s
user 0m0.006s
sys 0m0.153s
$ time du -hc /Users/bryan/Library/Thunderbird
real 0m0.024s
user 0m0.003s
sys 0m0.020s
# This is from the non-crypto disk and not cached:
$ time du -hc /Users/bryan/Library/Thunderbird
real 0m0.552s
user 0m0.005s
sys 0m0.071s
I created an encrypted APFS volume on top of a ZFS zvol:
# zfs create -s -V 50g ZfsMediaPool/mailcorestorage
# ls /var/run/zfs/zvol/dsk/ZfsMediaPool/mailcorestorage -al
lrwxr-xr-x 1 root daemon 11 Oct 2 02:27 /var/run/zfs/zvol/dsk/ZfsMediaPool/mailcorestorage -> /dev/disk16
# diskutil eraseDisk JHFS+ dummy GPT /dev/disk16
(I was lazy and used Disk Utility to create the encrypted APFS volume.)
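For the record, roughly the same thing could be done from the command line instead of Disk Utility; a sketch from memory of the diskutil apfs subcommands (the container identifier disk17 is assumed, and the flags may need checking against the man page):

```sh
# convert the zvol's disk into an APFS container, then add an encrypted
# volume to it; disk17 stands in for whatever identifier the new container gets
diskutil apfs createContainer /dev/disk16
diskutil apfs addVolume disk17 APFS encryptedMail -passprompt
```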
Copying the file from the non-encrypted dataset in the same pool to the encrypted APFS volume:
# dd if=/Volumes/ZfsMediaPool/random-10g of=/Volumes/encryptedMail/random-10g bs=1m
10240+0 records in
10240+0 records out
10737418240 bytes transferred in 83.464137 secs (128647089 bytes/sec)
Reading that file back from the APFS volume after remounting it:
dd if=/Users/bryan/Library/Thunderbird/random-10g of=/dev/null bs=1m
10240+0 records in
10240+0 records out
10737418240 bytes transferred in 34.071565 secs (315143090 bytes/sec)
Include any warning/errors/backtraces from the system logs
I've attached spindumps taken during these operations, plus one taken while unison was running: zfs-read-from-crypto-dataset-Spindump.txt, zfs-while-unison-running-Spindump.txt, zfs-write-to-crypto-big-dataset-Spindump.txt
zpool layout:
ComicBookGuy:~ root# zpool list -v
NAME SIZE ALLOC FREE CKPOINT EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
ZfsMediaPool 43.7T 10.7T 32.9T - - 0% 24% 1.00x ONLINE -
raidz2 43.7T 10.7T 32.9T - - 0% 24.6% - ONLINE
PCI0@0-SAT0@17-PRT5@5-PMP@0-@0:0 - - - - - - - - ONLINE
PCI0@0-SAT0@17-PRT4@4-PMP@0-@0:0 - - - - - - - - ONLINE
PCI0@0-RP21@1B,4-PXSX@0-PRT31@1f-PMP@0-@0:0 - - - - - - - - ONLINE
PCI0@0-SAT0@17-PRT3@3-PMP@0-@0:0 - - - - - - - - ONLINE
PCI0@0-SAT0@17-PRT2@2-PMP@0-@0:0 - - - - - - - - ONLINE
PCI0@0-SAT0@17-PRT1@1-PMP@0-@0:0 - - - - - - - - ONLINE
PCI0@0-SAT0@17-PRT0@0-PMP@0-@0:0 - - - - - - - - ONLINE
PCI0@0-RP21@1B,4-PXSX@0-PRT2@2-PMP@0-@0:0 - - - - - - - - ONLINE
PCI0@0-RP21@1B,4-PXSX@0-PRT3@3-PMP@0-@0:0 - - - - - - - - ONLINE
PCI0@0-RP21@1B,4-PXSX@0-PRT28@1c-PMP@0-@0:0 - - - - - - - - ONLINE
PCI0@0-RP21@1B,4-PXSX@0-PRT4@4-PMP@0-@0:0 - - - - - - - - ONLINE
PCI0@0-RP21@1B,4-PXSX@0-PRT29@1d-PMP@0-@0:0 - - - - - - - - ONLINE
logs - - - - - - - - -
PCI0@0-RP09@1D-PXSX@0-IONVMeController-IONVMeBlockStorageDevice@1-@1:3 15.5G 1.19M 15.5G - - 0% 0.00% - ONLINE
cache - - - - - - - - -
PCI0@0-RP09@1D-PXSX@0-IONVMeController-IONVMeBlockStorageDevice@1-@1:4 128G 108G 20.1G - - 0% 84.3% - ONLINE
I have the same problem on the two Macs I tested, with the same version of ZFS as yours. One is an iMac with an i5-5675R on macOS 10.15.7 (kernel root:xnu-6153.141.43~1/RELEASE_X86_64), the other is a MacBook Air with an i5-5350U on macOS 10.14.6 (kernel root:xnu-4903.278.70~1/RELEASE_X86_64).
I tested on a RAM disk (see my script) with dd from /dev/zero. Performance on an unencrypted dataset is over 700 MB/s on the iMac and 208 MB/s on the MBA, while with AES-256-GCM (the default) it falls to about 150 MB/s on the iMac and 20 MB/s on the MBA, with full CPU usage in both cases.
As a comparison, openssl speed -evp aes-256-gcm gives 400-1500 MB/s on the iMac and 310-1200 MB/s on the MBA with only a single core saturated in both cases, and a RAM-disk test in a Linux VM on the iMac gives over 500 MB/s for AES-256-GCM, so it's clearly not a problem with OpenZFS in general. Maybe this release of OpenZFS on OS X doesn't use AES-NI?
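The script isn't attached, but it is essentially the following (a sketch with my own sizes and labels; the RAM-disk size, passphrase handling, and default /Volumes mountpoint are assumptions):

```sh
#!/bin/sh
# RAM-disk encryption benchmark sketch: create a pool on a 2 GiB RAM disk,
# then time a 1000 MB sequential write for each encryption setting.
set -e

# ram:// takes a size in 512-byte sectors; 4194304 sectors = 2 GiB
DEV=$(hdiutil attach -nomount ram://4194304 | awk '{print $1}')

# compression off so /dev/zero data isn't compressed away
zpool create -f -O compression=off testpool "$DEV"

for enc in off aes-128-ccm aes-256-ccm aes-128-gcm aes-256-gcm; do
  if [ "$enc" = "off" ]; then
    zfs create testpool/bench
  else
    # passphrase is read from stdin when it isn't a terminal
    echo "benchmark-passphrase" | zfs create -o encryption="$enc" \
      -o keyformat=passphrase testpool/bench
  fi
  echo "Running benchmark for $enc..."
  dd if=/dev/zero of=/Volumes/testpool/bench/testfile bs=100m count=10
  zfs destroy testpool/bench
done

zpool destroy testpool
hdiutil detach "$DEV"
```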
Here are the results of the dd benchmark:
iMac
Running benchmark for noencryption...
10+0 records in
10+0 records out
1048576000 bytes transferred in 1.427054 secs (734783724 bytes/sec)
Running benchmark for aes128ccm...
10+0 records in
10+0 records out
1048576000 bytes transferred in 2.966284 secs (353498177 bytes/sec)
Running benchmark for aes256ccm...
10+0 records in
10+0 records out
1048576000 bytes transferred in 3.687919 secs (284327275 bytes/sec)
Running benchmark for aes128gcm...
10+0 records in
10+0 records out
1048576000 bytes transferred in 6.448514 secs (162607386 bytes/sec)
Running benchmark for aes256gcm...
10+0 records in
10+0 records out
1048576000 bytes transferred in 6.668878 secs (157234244 bytes/sec)
MacBook Air
Running benchmark for noencryption...
10+0 records in
10+0 records out
1048576000 bytes transferred in 5.040728 secs (208020742 bytes/sec)
Running benchmark for aes128ccm...
10+0 records in
10+0 records out
1048576000 bytes transferred in 9.648026 secs (108682958 bytes/sec)
Running benchmark for aes256ccm...
10+0 records in
10+0 records out
1048576000 bytes transferred in 9.792953 secs (107074546 bytes/sec)
Running benchmark for aes128gcm...
10+0 records in
10+0 records out
1048576000 bytes transferred in 50.366285 secs (20819006 bytes/sec)
Running benchmark for aes256gcm...
10+0 records in
10+0 records out
1048576000 bytes transferred in 53.425810 secs (19626768 bytes/sec)
And here are the results of openssl speed -evp aes-256-gcm:
iMac
LibreSSL 2.8.3
built on: date not available
options:bn(64,64) rc4(16x,int) des(idx,cisc,16,int) aes(partial) blowfish(idx)
compiler: information not available
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
aes-256-gcm 409696.18k 1008532.69k 1437244.00k 1539713.79k 1580495.42k
MacBook Air
LibreSSL 2.6.5
built on: date not available
options:bn(64,64) rc4(16x,int) des(idx,cisc,16,int) aes(partial) blowfish(idx)
compiler: information not available
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
aes-256-gcm 311740.97k 739539.17k 1127034.30k 1191862.55k 1227247.06k
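As an extra sanity check, both CPUs should advertise AES-NI; on Intel Macs the feature flags the kernel sees can be read directly (AES-NI shows up as AES, and AVX2 appears in the leaf7 list):

```sh
# list the CPU feature flags; look for AES and AVX/AVX2 in the output
sysctl machdep.cpu.features machdep.cpu.leaf7_features
```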
Commit 1d90a79
We should now be detecting the correct CPUID features (yikes) and using AES-NI when available. This could also be why AVX fails; need to test that separately.
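For anyone testing, it may also help to check which implementation the ICP actually selected. On Linux the icp module parameters below list the available implementations with the active one in brackets; whether the macOS build exposes equivalent tunables (for example via sysctl) is an assumption on my part:

```sh
# Linux: show the selected AES and GCM implementations (active one in [brackets])
cat /sys/module/icp/parameters/icp_aes_impl
cat /sys/module/icp/parameters/icp_gcm_impl

# macOS: look for equivalent tunables, if this build exports them
sysctl -a | grep -i -e icp_aes_impl -e icp_gcm_impl
```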
There is a test PKG on the forum for those who want to test it:
https://openzfsonosx.org/forum/viewtopic.php?f=26&t=3651&p=11593#p11588
Has this performance issue been fixed in version OpenZFSonOsX-2.1.0-Catalina-10.15?
I have not been able to get onto the openzfsonosx.org forum... the site appears to be down.
The website is broken at the moment due to a problem with the server that runs the VMs. If you have any urgent enquiries you might want to jump on #openzfs-macos on irc.libera.chat:6697.
It's nothing I'd consider urgent... I was going to try the test PKG linked above. Is there a way to get that PKG from GitHub?