AES GCM (128Bit and 256Bit) benefit with qat?

Open • sferlin opened this issue 1 year ago • 1 comment

I am trying to follow the instructions on this Intel reference page and reproduce the results reported by Intel, in which AES GCM with QAT is roughly 2x faster than the baseline (without QAT) for 8kB blocks, so far without success.

Is any CPU- or environment-related (i.e., openssl) setting missing? Or is the cipher simply no longer implemented with/for QAT (as some replies to similar issues in this repo hint at)?

I also tried creating a file /etc/sysconfig/qat based on this other Intel QAT reference, with different settings, and no change was observed.

Environment:

- OS: RHEL 9.2
- Machine 01: qatengine rpm version 1.0.0-1.el9_2, CPU: Intel(R) Xeon(R) Platinum 8462Y+
- Machine 02: QAT_Engine built from this repo, CPU: Intel(R) Xeon(R) Platinum 8480+

QAT:

- Machine 01:

openssl engine -t -c -v qatengine
(qatengine) Reference implementation of QAT crypto engine(qat_hw) v1.0.0
[RSA, AES-128-CBC-HMAC-SHA256, AES-256-CBC-HMAC-SHA256, ChaCha20-Poly1305, SHA3-256, SHA3-384, SHA3-512]
[ available ]
ENABLE_EXTERNAL_POLLING, POLL, SET_INSTANCE_FOR_THREAD,
GET_NUM_OP_RETRIES, SET_MAX_RETRY_COUNT, SET_INTERNAL_POLL_INTERVAL,
GET_EXTERNAL_POLLING_FD, ENABLE_EVENT_DRIVEN_POLLING_MODE,
GET_NUM_CRYPTO_INSTANCES, DISABLE_EVENT_DRIVEN_POLLING_MODE,
SET_EPOLL_TIMEOUT, SET_CRYPTO_SMALL_PACKET_OFFLOAD_THRESHOLD,
ENABLE_INLINE_POLLING, ENABLE_HEURISTIC_POLLING,
GET_NUM_REQUESTS_IN_FLIGHT, INIT_ENGINE, SET_CONFIGURATION_SECTION_NAME,
ENABLE_SW_FALLBACK, HEARTBEAT_POLL, DISABLE_QAT_OFFLOAD, HW_ALGO_BITMAP

- Machine 02:

openssl engine -t -c -v qatengine
(qatengine) Reference implementation of QAT crypto engine(qat_hw) v1.2.0
[RSA, AES-128-CBC-HMAC-SHA256, AES-256-CBC-HMAC-SHA256, ChaCha20-Poly1305, SHA3-256, SHA3-384, SHA3-512, TLS1-PRF, X25519, X448]
[ available ]
ENABLE_EXTERNAL_POLLING, POLL, SET_INSTANCE_FOR_THREAD,
GET_NUM_OP_RETRIES, SET_MAX_RETRY_COUNT, SET_INTERNAL_POLL_INTERVAL,
GET_EXTERNAL_POLLING_FD, ENABLE_EVENT_DRIVEN_POLLING_MODE,
GET_NUM_CRYPTO_INSTANCES, DISABLE_EVENT_DRIVEN_POLLING_MODE,
SET_EPOLL_TIMEOUT, SET_CRYPTO_SMALL_PACKET_OFFLOAD_THRESHOLD,
ENABLE_INLINE_POLLING, ENABLE_HEURISTIC_POLLING,
GET_NUM_REQUESTS_IN_FLIGHT, INIT_ENGINE, SET_CONFIGURATION_SECTION_NAME,
ENABLE_SW_FALLBACK, HEARTBEAT_POLL, DISABLE_QAT_OFFLOAD, HW_ALGO_BITMAP
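
One thing the capability lists above can be checked for is whether any GCM algorithm is advertised at all. The following is a minimal sketch, assuming the bracketed list printed by `openssl engine -t -c -v qatengine` reflects what the engine has registered; the helper name `advertises` is hypothetical and only used for illustration.

```python
# Capability lists copied from the `openssl engine -t -c -v qatengine` output above.
MACHINE_01 = ("RSA, AES-128-CBC-HMAC-SHA256, AES-256-CBC-HMAC-SHA256, "
              "ChaCha20-Poly1305, SHA3-256, SHA3-384, SHA3-512")
MACHINE_02 = MACHINE_01 + ", TLS1-PRF, X25519, X448"


def advertises(capabilities: str, needle: str) -> bool:
    """Return True if any advertised algorithm name contains `needle`.

    Hypothetical helper: it only inspects the pasted text, not the engine.
    """
    names = [name.strip().upper() for name in capabilities.split(",")]
    return any(needle.upper() in name for name in names)


for label, caps in (("Machine 01", MACHINE_01), ("Machine 02", MACHINE_02)):
    print(label, "advertises GCM:", advertises(caps, "GCM"))
```

Neither list contains an entry with "GCM" in its name. That would be consistent with the AES-GCM requests not being offloaded, though by itself it does not explain why.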

Openssl speed tests:

- Machine 01:

taskset 0x1 openssl speed -evp aes-128-gcm 
type           16 bytes     64 bytes      256 bytes   1024 bytes   8192 bytes   16384 bytes                                   
AES-128-GCM    1031977.31k  2447358.19k  4960747.43k  6953347.07k  7698314.58k  7777314.93k 
taskset 0x1 openssl speed -engine qatengine -evp aes-128-gcm                                           
type           16 bytes     64 bytes      256 bytes   1024 bytes   8192 bytes   16384 bytes                                   
AES-128-GCM    1034648.31k  2460134.78k  4985995.95k  6963253.93k  7721680.57k  7777047.89k

I also saw the same pattern, i.e., essentially identical values with and without the engine, for aes-256-cbc.

- Machine 02:

taskset 0x1 openssl speed -evp aes-128-gcm 
type           16 bytes     64 bytes      256 bytes   1024 bytes   8192 bytes   16384 bytes                                   
AES-128-GCM     943251.65k  2279441.22k  4627250.69k  6456699.12k  7145985.37k  7222886.40k
taskset 0x1 openssl speed -engine qatengine -evp aes-128-gcm     
type           16 bytes     64 bytes      256 bytes   1024 bytes   8192 bytes   16384 bytes                                   
AES-128-GCM     949183.69k  2278880.92k  4637970.26k  6496159.08k  7153216.17k  7228517.03k

I also saw the same pattern, i.e., essentially identical values with and without the engine, for aes-256-cbc.
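
To put a number on "no difference", the ratio of engine to software throughput can be computed per block size. This is a minimal sketch using only the summary values posted above; the names `RESULTS` and `engine_ratios` are made up for the example.

```python
# Throughput in 1000s of bytes per second, copied from the summary tables above.
BLOCK_SIZES = [16, 64, 256, 1024, 8192, 16384]
RESULTS = {
    "Machine 01": {
        "software": [1031977.31, 2447358.19, 4960747.43,
                     6953347.07, 7698314.58, 7777314.93],
        "qatengine": [1034648.31, 2460134.78, 4985995.95,
                      6963253.93, 7721680.57, 7777047.89],
    },
    "Machine 02": {
        "software": [943251.65, 2279441.22, 4627250.69,
                     6456699.12, 7145985.37, 7222886.40],
        "qatengine": [949183.69, 2278880.92, 4637970.26,
                      6496159.08, 7153216.17, 7228517.03],
    },
}


def engine_ratios(software, engine):
    """Return the engine/software throughput ratio for each block size."""
    return [e / s for s, e in zip(software, engine)]


for machine, runs in RESULTS.items():
    ratios = engine_ratios(runs["software"], runs["qatengine"])
    for size, ratio in zip(BLOCK_SIZES, ratios):
        print(f"{machine} {size:>6} bytes: {ratio:.3f}x")
```

On both machines every ratio stays within 1% of 1.0, far from the roughly 2x expected at 8192 bytes.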

Detailed output:

- Machine 01:

taskset 0x1 openssl speed -evp aes-128-gcm 
Doing AES-128-GCM for 3s on 16 size blocks: 192537514 AES-128-GCM's in 2.99s                                                 
Doing AES-128-GCM for 3s on 64 size blocks: 115424354 AES-128-GCM's in 3.00s                                                 
Doing AES-128-GCM for 3s on 256 size blocks: 58472809 AES-128-GCM's in 3.00s                                                 
Doing AES-128-GCM for 3s on 1024 size blocks: 20311076 AES-128-GCM's in 3.00s                                                
Doing AES-128-GCM for 3s on 8192 size blocks: 2818695 AES-128-GCM's in 3.00s                                                 
Doing AES-128-GCM for 3s on 16384 size blocks: 1424309 AES-128-GCM's in 2.99s                                                
version: 3.0.7
built on: Wed Mar  8 00:00:00 2023 UTC 
options: bn(64,64)
compiler: gcc -fPIC -pthread -m64 -Wa,--noexecstack -Wall -O3 -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -Wa,--noexecstack -Wa,--generate-missing-build-notes=yes -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_BUILDING_OPENSSL -DZLIB -DNDEBUG -DPURIFY -DDEVRANDOM="\"/dev/urandom\"" -DREDHAT_FIPS_VERSION="\"3.0.7-1f5987c2732dd431\"" -DSYSTEM_CIPHERS_FILE="/etc/crypto-policies/back-ends/openssl.config"
CPUINFO: OPENSSL_ia32cap=0x7ffef3ffffebffff:0xfb417ffef3bfb7ef                                                               
The 'numbers' are in 1000s of bytes per second processed.                                                                    
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes                                   
AES-128-GCM    1030301.08k  2462386.22k  4989679.70k  6932847.27k  7696916.48k  7804641.69k 
taskset 0x1 openssl speed -engine qatengine -evp aes-128-gcm                                           
Engine "qatengine" set.                   
Doing AES-128-GCM for 3s on 16 size blocks: 193996559 AES-128-GCM's in 3.00s                                                 
Doing AES-128-GCM for 3s on 64 size blocks: 115318818 AES-128-GCM's in 3.00s                                                 
Doing AES-128-GCM for 3s on 256 size blocks: 58429640 AES-128-GCM's in 3.00s                                                 
Doing AES-128-GCM for 3s on 1024 size blocks: 20400158 AES-128-GCM's in 3.00s                                                
Doing AES-128-GCM for 3s on 8192 size blocks: 2818338 AES-128-GCM's in 2.99s                                                 
Doing AES-128-GCM for 3s on 16384 size blocks: 1424020 AES-128-GCM's in 3.00s                                                
version: 3.0.7                                                                                                               
built on: Wed Mar  8 00:00:00 2023 UTC                                                                                       
options: bn(64,64)
compiler: gcc -fPIC -pthread -m64 -Wa,--noexecstack -Wall -O3 -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -Wa,--noexecstack -Wa,--generate-missing-build-notes=yes -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_BUILDING_OPENSSL -DZLIB -DNDEBUG -DPURIFY -DDEVRANDOM="\"/dev/urandom\"" -DREDHAT_FIPS_VERSION="\"3.0.7-1f5987c2732dd431\"" -DSYSTEM_CIPHERS_FILE="/etc/crypto-policies/back-ends/openssl.config"
CPUINFO: OPENSSL_ia32cap=0x7ffef3ffffebffff:0xfb417ffef3bfb7ef
The 'numbers' are in 1000s of bytes per second processed.                                                                    
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes                                   
AES-128-GCM    1034648.31k  2460134.78k  4985995.95k  6963253.93k  7721680.57k  7777047.89k  

- Machine 02:

taskset 0x1 openssl speed -evp aes-128-gcm
Doing AES-128-GCM for 3s on 16 size blocks: 176270152 AES-128-GCM's in 2.99s
Doing AES-128-GCM for 3s on 64 size blocks: 106848807 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 256 size blocks: 54225594 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 1024 size blocks: 18853057 AES-128-GCM's in 2.99s
Doing AES-128-GCM for 3s on 8192 size blocks: 2616938 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 16384 size blocks: 1322550 AES-128-GCM's in 3.00s
version: 3.0.7
built on: Wed Mar  8 00:00:00 2023 UTC
options: bn(64,64)
compiler: gcc -fPIC -pthread -m64 -Wa,--noexecstack -Wall -O3 -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -Wa,--noexecstack -Wa,--generate-missing-build-notes=yes -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_BUILDING_OPENSSL -DZLIB -DNDEBUG -DPURIFY -DDEVRANDOM="\"/dev/urandom\"" -DREDHAT_FIPS_VERSION="\"3.0.7-1f5987c2732dd431\"" -DSYSTEM_CIPHERS_FILE="/etc/crypto-policies/back-ends/openssl.config"
CPUINFO: OPENSSL_ia32cap=0x7ffef3ffffebffff:0xfb417ffef3bfb7ef
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
AES-128-GCM     943251.65k  2279441.22k  4627250.69k  6456699.12k  7145985.37k  7222886.40k
taskset 0x1 openssl speed -engine qatengine -evp aes-128-gcm
Engine "qatengine" set.
Doing AES-128-GCM for 3s on 16 size blocks: 177378703 AES-128-GCM's in 2.99s
Doing AES-128-GCM for 3s on 64 size blocks: 106822543 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 256 size blocks: 54351214 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 1024 size blocks: 18968277 AES-128-GCM's in 2.99s
Doing AES-128-GCM for 3s on 8192 size blocks: 2619586 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 16384 size blocks: 1323581 AES-128-GCM's in 3.00s
version: 3.0.7
built on: Wed Mar  8 00:00:00 2023 UTC
options: bn(64,64)
compiler: gcc -fPIC -pthread -m64 -Wa,--noexecstack -Wall -O3 -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -Wa,--noexecstack -Wa,--generate-missing-build-notes=yes -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_BUILDING_OPENSSL -DZLIB -DNDEBUG -DPURIFY -DDEVRANDOM="\"/dev/urandom\"" -DREDHAT_FIPS_VERSION="\"3.0.7-1f5987c2732dd431\"" -DSYSTEM_CIPHERS_FILE="/etc/crypto-policies/back-ends/openssl.config"
CPUINFO: OPENSSL_ia32cap=0x7ffef3ffffebffff:0xfb417ffef3bfb7ef
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
AES-128-GCM     949183.69k  2278880.92k  4637970.26k  6496159.08k  7153216.17k  7228517.03k
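
For reference, the summary rows follow directly from the "Doing ... for 3s" lines: operations times block size, divided by elapsed seconds, expressed in 1000s of bytes per second. The sketch below reproduces two of the 8192-byte values from the detailed output above; the function name `throughput_k` is made up for the example, and because the elapsed time is printed with only two decimals, other rows may differ in the last digits.

```python
def throughput_k(ops: int, block_bytes: int, seconds: float) -> float:
    """Throughput in 1000s of bytes per second, as printed by openssl speed."""
    return ops * block_bytes / seconds / 1000.0


# Machine 01, software run: 2818695 operations on 8192-byte blocks in 3.00s.
print(f"{throughput_k(2818695, 8192, 3.00):.2f}k")

# Machine 02, qatengine run: 2619586 operations on 8192-byte blocks in 3.00s.
print(f"{throughput_k(2619586, 8192, 3.00):.2f}k")
```

Since the operation counts with and without the engine are nearly identical, the throughput figures are as well.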

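Other observations, testing with another algorithm (RSA2048): running 1) openssl speed in software, 2) QAT synchronous, and 3) QAT asynchronous (-async_jobs 8) gives the following results. The software baseline is shown first, and each QAT run is labelled with its speed-up factor in sign/s relative to that baseline: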

openssl speed rsa2048
                  sign    verify    sign/s verify/s
rsa 2048 bits 0.000237s 0.000014s   4212.3  72485.9 

2x:

openssl speed -engine qatengine rsa2048
                  sign    verify    sign/s verify/s 
rsa 2048 bits 0.000124s 0.000017s   8074.0  57805.8 

~10x:

openssl speed -engine qatengine -async_jobs 8 rsa2048
                  sign    verify    sign/s verify/s
rsa 2048 bits 0.000024s 0.000005s  41191.2 206077.2
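
The quoted factors can be recomputed from the sign/s and verify/s columns. This is a minimal sketch using only the three result rows above; the dictionary keys and the `speedup` helper are illustrative names, not part of any tool.

```python
# (sign/s, verify/s) copied from the three rsa2048 runs above.
RSA_RUNS = {
    "software": (4212.3, 72485.9),
    "qat_sync": (8074.0, 57805.8),
    "qat_async_8": (41191.2, 206077.2),
}


def speedup(run: str, column: int) -> float:
    """Ratio of a run to the software baseline; column 0 is sign/s, 1 is verify/s."""
    return RSA_RUNS[run][column] / RSA_RUNS["software"][column]


for run in ("qat_sync", "qat_async_8"):
    print(f"{run}: sign {speedup(run, 0):.2f}x, verify {speedup(run, 1):.2f}x")
```

For sign/s this gives roughly 1.9x for the synchronous run and roughly 9.8x with 8 async jobs, matching the labels above. Verify/s behaves differently: the synchronous run is slower than software (about 0.8x), and only the asynchronous run is faster (about 2.8x).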

sferlin, Jul 28 '23 12:07