QAT_Engine
AES GCM (128Bit and 256Bit) benefit with qat?
I am trying to follow the instructions on this Intel reference page to reproduce the results reported by Intel, where AES GCM with QAT shows a (roughly) 2x gain over the baseline (without QAT) for 8kB blocks, so far without success.
Is there any CPU or environment-related setting (e.g., for openssl) missing? Or is the cipher simply no longer implemented with/for QAT (as some replies to similar issues in this repo hint at)?
I also tried creating a file /etc/sysconfig/qat based on this other Intel QAT reference, with different settings, but observed no change.
Environment:
- OS: RHEL 9.2
- Machine 01: qatengine rpm version 1.0.0-1.el9_2, CPU: Intel(R) Xeon(R) Platinum 8462Y+
- Machine 02: QAT_Engine built from this repo, CPU: Intel(R) Xeon(R) Platinum 8480+
QAT:
- Machine 01:
openssl engine -t -c -v qatengine
(qatengine) Reference implementation of QAT crypto engine(qat_hw) v1.0.0
[RSA, AES-128-CBC-HMAC-SHA256, AES-256-CBC-HMAC-SHA256, ChaCha20-Poly1305, SHA3-256, SHA3-384, SHA3-512]
[ available ]
ENABLE_EXTERNAL_POLLING, POLL, SET_INSTANCE_FOR_THREAD,
GET_NUM_OP_RETRIES, SET_MAX_RETRY_COUNT, SET_INTERNAL_POLL_INTERVAL,
GET_EXTERNAL_POLLING_FD, ENABLE_EVENT_DRIVEN_POLLING_MODE,
GET_NUM_CRYPTO_INSTANCES, DISABLE_EVENT_DRIVEN_POLLING_MODE,
SET_EPOLL_TIMEOUT, SET_CRYPTO_SMALL_PACKET_OFFLOAD_THRESHOLD,
ENABLE_INLINE_POLLING, ENABLE_HEURISTIC_POLLING,
GET_NUM_REQUESTS_IN_FLIGHT, INIT_ENGINE, SET_CONFIGURATION_SECTION_NAME,
ENABLE_SW_FALLBACK, HEARTBEAT_POLL, DISABLE_QAT_OFFLOAD, HW_ALGO_BITMAP
- Machine 02:
openssl engine -t -c -v qatengine
(qatengine) Reference implementation of QAT crypto engine(qat_hw) v1.2.0
[RSA, AES-128-CBC-HMAC-SHA256, AES-256-CBC-HMAC-SHA256, ChaCha20-Poly1305, SHA3-256, SHA3-384, SHA3-512, TLS1-PRF, X25519, X448]
[ available ]
ENABLE_EXTERNAL_POLLING, POLL, SET_INSTANCE_FOR_THREAD,
GET_NUM_OP_RETRIES, SET_MAX_RETRY_COUNT, SET_INTERNAL_POLL_INTERVAL,
GET_EXTERNAL_POLLING_FD, ENABLE_EVENT_DRIVEN_POLLING_MODE,
GET_NUM_CRYPTO_INSTANCES, DISABLE_EVENT_DRIVEN_POLLING_MODE,
SET_EPOLL_TIMEOUT, SET_CRYPTO_SMALL_PACKET_OFFLOAD_THRESHOLD,
ENABLE_INLINE_POLLING, ENABLE_HEURISTIC_POLLING,
GET_NUM_REQUESTS_IN_FLIGHT, INIT_ENGINE, SET_CONFIGURATION_SECTION_NAME,
ENABLE_SW_FALLBACK, HEARTBEAT_POLL, DISABLE_QAT_OFFLOAD, HW_ALGO_BITMAP
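One thing that can be checked directly from the two capability listings above is whether any GCM cipher is advertised at all. The following is a minimal sketch that only inspects the pasted `openssl engine -t -c -v qatengine` output; the helper names (`parse_algos`, `advertises`) are made up for illustration, and whether the absence of GCM from this list actually explains the missing speed-up is an assumption, not something the listing proves.

```python
# Algorithm lists copied verbatim from the `openssl engine -t -c -v qatengine`
# output of both machines (hypothetical dictionary keys, chosen for this sketch).
ENGINE_ALGOS = {
    "Machine 01 (v1.0.0)": (
        "[RSA, AES-128-CBC-HMAC-SHA256, AES-256-CBC-HMAC-SHA256, "
        "ChaCha20-Poly1305, SHA3-256, SHA3-384, SHA3-512]"
    ),
    "Machine 02 (v1.2.0)": (
        "[RSA, AES-128-CBC-HMAC-SHA256, AES-256-CBC-HMAC-SHA256, "
        "ChaCha20-Poly1305, SHA3-256, SHA3-384, SHA3-512, TLS1-PRF, X25519, X448]"
    ),
}


def parse_algos(line):
    """Split a bracketed, comma-separated capability line into algorithm names."""
    return [a.strip() for a in line.strip().strip("[]").split(",") if a.strip()]


def advertises(line, needle):
    """Return True if any advertised algorithm name contains `needle` (case-insensitive)."""
    return any(needle.lower() in algo.lower() for algo in parse_algos(line))


for machine, line in ENGINE_ALGOS.items():
    print(f"{machine}: GCM advertised = {advertises(line, 'gcm')}")
```

On both listings the check reports that no GCM entry is advertised, while RSA and the AES-CBC-HMAC-SHA256 chained ciphers are.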
Openssl speed tests:
- Machine 01:
taskset 0x1 openssl speed -evp aes-128-gcm
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
AES-128-GCM 1031977.31k 2447358.19k 4960747.43k 6953347.07k 7698314.58k 7777314.93k
taskset 0x1 openssl speed -engine qatengine -evp aes-128-gcm
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
AES-128-GCM 1034648.31k 2460134.78k 4985995.95k 6963253.93k 7721680.57k 7777047.89k
The same behavior (practically identical values with and without the engine) was observed for aes-256-cbc.
- Machine 02:
taskset 0x1 openssl speed -evp aes-128-gcm
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
AES-128-GCM 943251.65k 2279441.22k 4627250.69k 6456699.12k 7145985.37k 7222886.40k
taskset 0x1 openssl speed -engine qatengine -evp aes-128-gcm
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
AES-128-GCM 949183.69k 2278880.92k 4637970.26k 6496159.08k 7153216.17k 7228517.03k
The same behavior (practically identical values with and without the engine) was observed for aes-256-cbc.
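To put a number on "no change", the summary rows above can be turned into engine/software ratios per block size. This is a small sketch using only the throughput figures already posted (in 1000s of bytes per second); the variable and function names are illustrative.

```python
# Throughput in 1000s of bytes/s, copied from the `openssl speed` summaries above.
BLOCK_SIZES = [16, 64, 256, 1024, 8192, 16384]

RESULTS = {
    "Machine 01": {
        "software": [1031977.31, 2447358.19, 4960747.43, 6953347.07, 7698314.58, 7777314.93],
        "qatengine": [1034648.31, 2460134.78, 4985995.95, 6963253.93, 7721680.57, 7777047.89],
    },
    "Machine 02": {
        "software": [943251.65, 2279441.22, 4627250.69, 6456699.12, 7145985.37, 7222886.40],
        "qatengine": [949183.69, 2278880.92, 4637970.26, 6496159.08, 7153216.17, 7228517.03],
    },
}


def ratios(software, engine):
    """Engine throughput divided by software throughput, one value per block size."""
    return [e / s for s, e in zip(software, engine)]


for machine, runs in RESULTS.items():
    print(machine)
    for size, r in zip(BLOCK_SIZES, ratios(runs["software"], runs["qatengine"])):
        print(f"  {size:>6} bytes: {r:.4f}x")
```

Every ratio stays within 1% of 1.0 on both machines, including the 8192-byte case where a roughly 2x gain was expected.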
Detailed output:
- Machine 01:
taskset 0x1 openssl speed -evp aes-128-gcm
Doing AES-128-GCM for 3s on 16 size blocks: 192537514 AES-128-GCM's in 2.99s
Doing AES-128-GCM for 3s on 64 size blocks: 115424354 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 256 size blocks: 58472809 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 1024 size blocks: 20311076 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 8192 size blocks: 2818695 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 16384 size blocks: 1424309 AES-128-GCM's in 2.99s
version: 3.0.7
built on: Wed Mar 8 00:00:00 2023 UTC
options: bn(64,64)
compiler: gcc -fPIC -pthread -m64 -Wa,--noexecstack -Wall -O3 -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -Wa,--noexecstack -Wa,--generate-missing-build-notes=yes -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_BUILDING_OPENSSL -DZLIB -DNDEBUG -DPURIFY -DDEVRANDOM="\"/dev/urandom\"" -DREDHAT_FIPS_VERSION="\"3.0.7-1f5987c2732dd431\"" -DSYSTEM_CIPHERS_FILE="/etc/crypto-policies/back-ends/openssl.config"
CPUINFO: OPENSSL_ia32cap=0x7ffef3ffffebffff:0xfb417ffef3bfb7ef
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
AES-128-GCM 1030301.08k 2462386.22k 4989679.70k 6932847.27k 7696916.48k 7804641.69k
taskset 0x1 openssl speed -engine qatengine -evp aes-128-gcm
Engine "qatengine" set.
Doing AES-128-GCM for 3s on 16 size blocks: 193996559 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 64 size blocks: 115318818 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 256 size blocks: 58429640 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 1024 size blocks: 20400158 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 8192 size blocks: 2818338 AES-128-GCM's in 2.99s
Doing AES-128-GCM for 3s on 16384 size blocks: 1424020 AES-128-GCM's in 3.00s
version: 3.0.7
built on: Wed Mar 8 00:00:00 2023 UTC
options: bn(64,64)
compiler: gcc -fPIC -pthread -m64 -Wa,--noexecstack -Wall -O3 -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -Wa,--noexecstack -Wa,--generate-missing-build-notes=yes -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_BUILDING_OPENSSL -DZLIB -DNDEBUG -DPURIFY -DDEVRANDOM="\"/dev/urandom\"" -DREDHAT_FIPS_VERSION="\"3.0.7-1f5987c2732dd431\"" -DSYSTEM_CIPHERS_FILE="/etc/crypto-policies/back-ends/openssl.config"
CPUINFO: OPENSSL_ia32cap=0x7ffef3ffffebffff:0xfb417ffef3bfb7ef
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
AES-128-GCM 1034648.31k 2460134.78k 4985995.95k 6963253.93k 7721680.57k 7777047.89k
- Machine 02:
taskset 0x1 openssl speed -evp aes-128-gcm
Doing AES-128-GCM for 3s on 16 size blocks: 176270152 AES-128-GCM's in 2.99s
Doing AES-128-GCM for 3s on 64 size blocks: 106848807 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 256 size blocks: 54225594 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 1024 size blocks: 18853057 AES-128-GCM's in 2.99s
Doing AES-128-GCM for 3s on 8192 size blocks: 2616938 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 16384 size blocks: 1322550 AES-128-GCM's in 3.00s
version: 3.0.7
built on: Wed Mar 8 00:00:00 2023 UTC
options: bn(64,64)
compiler: gcc -fPIC -pthread -m64 -Wa,--noexecstack -Wall -O3 -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -Wa,--noexecstack -Wa,--generate-missing-build-notes=yes -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_BUILDING_OPENSSL -DZLIB -DNDEBUG -DPURIFY -DDEVRANDOM="\"/dev/urandom\"" -DREDHAT_FIPS_VERSION="\"3.0.7-1f5987c2732dd431\"" -DSYSTEM_CIPHERS_FILE="/etc/crypto-policies/back-ends/openssl.config"
CPUINFO: OPENSSL_ia32cap=0x7ffef3ffffebffff:0xfb417ffef3bfb7ef
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
AES-128-GCM 943251.65k 2279441.22k 4627250.69k 6456699.12k 7145985.37k 7222886.40k
taskset 0x1 openssl speed -engine qatengine -evp aes-128-gcm
Engine "qatengine" set.
Doing AES-128-GCM for 3s on 16 size blocks: 177378703 AES-128-GCM's in 2.99s
Doing AES-128-GCM for 3s on 64 size blocks: 106822543 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 256 size blocks: 54351214 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 1024 size blocks: 18968277 AES-128-GCM's in 2.99s
Doing AES-128-GCM for 3s on 8192 size blocks: 2619586 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 16384 size blocks: 1323581 AES-128-GCM's in 3.00s
version: 3.0.7
built on: Wed Mar 8 00:00:00 2023 UTC
options: bn(64,64)
compiler: gcc -fPIC -pthread -m64 -Wa,--noexecstack -Wall -O3 -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -Wa,--noexecstack -Wa,--generate-missing-build-notes=yes -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_BUILDING_OPENSSL -DZLIB -DNDEBUG -DPURIFY -DDEVRANDOM="\"/dev/urandom\"" -DREDHAT_FIPS_VERSION="\"3.0.7-1f5987c2732dd431\"" -DSYSTEM_CIPHERS_FILE="/etc/crypto-policies/back-ends/openssl.config"
CPUINFO: OPENSSL_ia32cap=0x7ffef3ffffebffff:0xfb417ffef3bfb7ef
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
AES-128-GCM 949183.69k 2278880.92k 4637970.26k 6496159.08k 7153216.17k 7228517.03k
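For reference, the summary figures follow directly from the "Doing ... for 3s" lines: operations times block size, divided by the measured seconds, reported in 1000s of bytes per second. The sketch below reproduces two of the Machine 01 values from the detailed output; `throughput_k` is a hypothetical helper name, not part of openssl.

```python
def throughput_k(ops, block_size, seconds):
    """Throughput in 1000s of bytes per second, as printed by `openssl speed`."""
    return ops * block_size / seconds / 1000.0


# (operations, block size in bytes, measured seconds) from the Machine 01 output.
software_8k = throughput_k(2818695, 8192, 3.00)  # run without the engine
engine_8k = throughput_k(2818338, 8192, 2.99)    # run with -engine qatengine

print(f"software : {software_8k:.2f}k")
print(f"qatengine: {engine_8k:.2f}k")
print(f"ratio    : {engine_8k / software_8k:.4f}")
```

The operation counts of the two runs differ by only a few hundred over three seconds, which is what one would expect if both runs exercise the same code path, although the output alone does not confirm which path that is.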
Other observations - testing with another algorithm (RSA 2048): comparing 1) openssl speed in software, 2) QAT synchronous, and 3) QAT asynchronous (-async_jobs 8) gives the following speed-up factors (sign/s):
openssl speed rsa2048
sign verify sign/s verify/s
rsa 2048 bits 0.000237s 0.000014s 4212.3 72485.9
~2x (QAT synchronous):
openssl speed -engine qatengine rsa2048
sign verify sign/s verify/s
rsa 2048 bits 0.000124s 0.000017s 8074.0 57805.8
~10x (QAT asynchronous, -async_jobs 8):
openssl speed -engine qatengine -async_jobs 8 rsa2048
sign verify sign/s verify/s
rsa 2048 bits 0.000024s 0.000005s 41191.2 206077.2
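The quoted factors can be recomputed from the three RSA 2048 result rows above. This sketch uses only the posted sign/s and verify/s figures; the dictionary keys are labels chosen for illustration.

```python
# (sign/s, verify/s) copied from the three `openssl speed ... rsa2048` runs above.
RSA2048 = {
    "software": (4212.3, 72485.9),
    "qat_sync": (8074.0, 57805.8),
    "qat_async_8_jobs": (41191.2, 206077.2),
}


def speedup(run, baseline="software"):
    """Return (sign speed-up, verify speed-up) of `run` relative to `baseline`."""
    base_sign, base_verify = RSA2048[baseline]
    sign, verify = RSA2048[run]
    return sign / base_sign, verify / base_verify


for name in ("qat_sync", "qat_async_8_jobs"):
    s, v = speedup(name)
    print(f"{name:>17}: sign {s:.2f}x, verify {v:.2f}x")
```

The sign factors come out at roughly 1.9x and 9.8x, matching the ~2x and ~10x quoted above. The same numbers also show that verify in the synchronous run is slower than software (about 0.8x) and only gains with asynchronous jobs.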