Unable to run asm code on OpenBSD (arm64)
In order to get asm code running on OpenBSD we must place all constants into .rodata sections.
closes #23312
Is the update to cloudflare-quiche needed as part of this?
Note, when building on Apple Silicon I get
m2 $ perl Configure --banner="Done" darwin64-arm64 no-shared no-threads no-async -Wimplicit-function-declaration
Configuring OpenSSL version 3.4.0-dev for target darwin64-arm64
Using os-specific seed configuration
Created configdata.pm
Running configdata.pm
Created Makefile.in
Created Makefile
Done
m2 $ make
perl util/mkinstallvars.pl PREFIX=. BINDIR=apps LIBDIR= INCLUDEDIR=include APPLINKDIR=ms ENGINESDIR=engines MODULESDIR=providers "VERSION=3.4.0-dev" "LDLIBS= " > builddata.pm
perl "-I." "-Iutil/perl" "-Mconfigdata" "-MOpenSSL::paramnames" "util/dofile.pl" "-oMakefile" crypto/params_idx.c.in > crypto/params_idx.c
perl util/mkinstallvars.pl "PREFIX=/usr/local" BINDIR=bin "LIBDIR=lib" INCLUDEDIR=include APPLINKDIR=include/openssl "ENGINESDIR=/usr/local/lib/engines-3" "MODULESDIR=/usr/local/lib/ossl-modules" "PKGCONFIGDIR=/usr/local/lib/pkgconfig" "CMAKECONFIGDIR=/usr/local/lib/cmake/OpenSSL" "LDLIBS= " "VERSION=3.4.0-dev" > installdata.pm
perl "-I." "-Mconfigdata" "util/dofile.pl" "-oMakefile" include/crypto/bn_conf.h.in > include/crypto/bn_conf.h
:
:
cc -Icrypto -I. -Iinclude -Iproviders/common/include -Iproviders/implementations/include -DBSAES_ASM -DECP_NISTZ256_ASM -DECP_SM2P256_ASM -DKECCAK1600_ASM -DMD5_ASM -DOPENSSL_BN_ASM_MONT -DOPENSSL_CPUID_OBJ -DOPENSSL_SM3_ASM -DPOLY1305_ASM -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DSM4_ASM -DVPAES_ASM -DVPSM4_ASM -fPIC -arch arm64 -O3 -Wall -Wimplicit-function-declaration -DL_ENDIAN -DOPENSSL_PIC -DOPENSSLDIR="\"/usr/local/ssl\"" -DENGINESDIR="\"/usr/local/lib/engines-3\"" -DMODULESDIR="\"/usr/local/lib/ossl-modules\"" -DOPENSSL_BUILDING_OPENSSL -DNDEBUG -c -o crypto/aes/libcrypto-lib-aesv8-armx.o crypto/aes/aesv8-armx.S
/var/folders/45/s6l2x9xn1vndxk4syvrgrfxh0000gn/T/aesv8-armx-361efa.s:41:2: error: unknown AArch64 fixup kind!
add x3,x3,:lo12:Lrcon
^
make[1]: *** [crypto/aes/libcrypto-lib-aesv8-armx.o] Error 1
make: *** [build_sw] Error 2
Note, when building on Apple Silicon I get
m2 $ perl Configure --banner="Done" darwin64-arm64 no-shared no-threads no-async -Wimplicit-function-declaration Configuring OpenSSL version 3.4.0-dev for target darwin64-arm64 Using os-specific seed configuration Created configdata.pm Running configdata.pm Created Makefile.in Created Makefile Done m2 $ make perl util/mkinstallvars.pl PREFIX=. BINDIR=apps LIBDIR= INCLUDEDIR=include APPLINKDIR=ms ENGINESDIR=engines MODULESDIR=providers "VERSION=3.4.0-dev" "LDLIBS= " > builddata.pm perl "-I." "-Iutil/perl" "-Mconfigdata" "-MOpenSSL::paramnames" "util/dofile.pl" "-oMakefile" crypto/params_idx.c.in > crypto/params_idx.c perl util/mkinstallvars.pl "PREFIX=/usr/local" BINDIR=bin "LIBDIR=lib" INCLUDEDIR=include APPLINKDIR=include/openssl "ENGINESDIR=/usr/local/lib/engines-3" "MODULESDIR=/usr/local/lib/ossl-modules" "PKGCONFIGDIR=/usr/local/lib/pkgconfig" "CMAKECONFIGDIR=/usr/local/lib/cmake/OpenSSL" "LDLIBS= " "VERSION=3.4.0-dev" > installdata.pm perl "-I." "-Mconfigdata" "util/dofile.pl" "-oMakefile" include/crypto/bn_conf.h.in > include/crypto/bn_conf.h : : cc -Icrypto -I. -Iinclude -Iproviders/common/include -Iproviders/implementations/include -DBSAES_ASM -DECP_NISTZ256_ASM -DECP_SM2P256_ASM -DKECCAK1600_ASM -DMD5_ASM -DOPENSSL_BN_ASM_MONT -DOPENSSL_CPUID_OBJ -DOPENSSL_SM3_ASM -DPOLY1305_ASM -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DSM4_ASM -DVPAES_ASM -DVPSM4_ASM -fPIC -arch arm64 -O3 -Wall -Wimplicit-function-declaration -DL_ENDIAN -DOPENSSL_PIC -DOPENSSLDIR="\"/usr/local/ssl\"" -DENGINESDIR="\"/usr/local/lib/engines-3\"" -DMODULESDIR="\"/usr/local/lib/ossl-modules\"" -DOPENSSL_BUILDING_OPENSSL -DNDEBUG -c -o crypto/aes/libcrypto-lib-aesv8-armx.o crypto/aes/aesv8-armx.S /var/folders/45/s6l2x9xn1vndxk4syvrgrfxh0000gn/T/aesv8-armx-361efa.s:41:2: error: unknown AArch64 fixup kind! add x3,x3,:lo12:Lrcon ^ make[1]: *** [crypto/aes/libcrypto-lib-aesv8-armx.o] Error 1 make: *** [build_sw] Error 2
The clouflare-quiche is accidental change. mostlikely caused by my poor git skills.
Took a look at CI-results and see the failure on Mac-OS too. will investigate and update PR with fix. please stay tuned.
EDIT: just noticed error on cross-compilation, will take a look and ping you again
@tom-cosgrove-arm please, now it's a good time for review. all checks are green (I think). The list of changes since last time:
- the accidental addition of cloudflare-quiche is gone from PR (thanks for spotting this).
- all changes come from @botovq except a small tweak to
arm-xlate.pl - change to
arm-xlate.plmakes macos-14 happy as well as arm linux builds those changes do work for OpenSSL running on OpenBSD as far as I can tell.
Hope to see this PR be merged soon!
@tom-cosgrove-arm could you please look?
the file crypto/aes/asm/aesv8-armx.pl is also used to build armv7 (32-bits)
the adrp instruction we use to deal with address relocation on 64-bit CPU
does not work on 32-bit armv7 (compilation fails)
According to 9.4.2.4 there are two ways to deal with relocation on 32-bit arm. In this particular case of aesv8-armx we can use the first variant like that:
movw $ptr, #:lower16:.Lrcon
movt $ptr, #:upper16:.Lrcon
Unfortunately it fails to link. The linker fails with error as follows:
Run make -s -j4
make -s -j4
shell: /usr/bin/bash -e {0}
/usr/lib/gcc-cross/arm-linux-gnueabihf/11/../../../../arm-linux-gnueabihf/bin/ld: crypto/aes/libcrypto-shlib-aesv8-armx.o: relocation R_ARM_THM_MOVW_ABS_NC against `a local symbol' can not be used when making a shared object; recompile with -fPIC
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:6464: libcrypto.so.3] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [Makefile:3826: build_sw] Error 2
Error: Process completed with exit code 2.
The error is odd. Because if I understand the generated Makefile right the command which compiles the module reads as follows (snippet comes from Makefile):
3713 CFLAGS=-Wall -O3 -DPEDANTIC -pedantic -Wno-long-long -DUNUSEDRESULT_DEBUG -Wall -Wmissing-declarations -Wextra -Wno-unused-parameter -Wno-missing-field-initializers -Wswitch -Wsign-compare -Wshadow -Wformat -Wno-type-limits -Wno-tautological-constant-out-of-range-compare -Wundef -Werror -Wmissing-prototypes - Wstrict-prototypes
....
3743 CNF_CFLAGS=-pthread -Wa,--noexecstack
....
3755 LIB_CFLAGS=-fPIC $(CNF_CFLAGS) $(CFLAGS)
....
7258 crypto/aes/libcrypto-shlib-aesv8-armx.o: crypto/aes/aesv8-armx.S
7259 $(CC) -Icrypto -I. -Iinclude -Iproviders/common/include -Iproviders/implementations/include -DAES_ASM -DBSAES_ASM -DECP_NISTZ256_ASM -DGHASH_ASM -DKECCAK1600_ASM -DOPENSSL_BN_ASM_GF2m -DOPENSSL_BN_ASM_MONT -DOPENSSL_CPUID_OBJ -DPOLY1305_ASM -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM $(LIB_CFLAGS) $(LIB_CPPFLAGS) -c -o $@ crypto/aes/aesv8-armx.S
7260 crypto/aes/aesv8-armx.S: crypto/aes/asm/aesv8-armx.pl
CC="$(CC)" $(PERL) crypto/aes/asm/aesv8-armx.pl "$(PERLASM_SCHEME)" -Icrypto -I. -Iinclude -Iproviders/common/include -Iproviders/implementations/include $(LIB_CFLAGS) $(LIB_CPPFLAGS) -DAES_ASM -DBSAES_ASM -DECP_NISTZ256_ASM -DGHASH_ASM -DKECCAK1600_ASM -DOPENSSL_BN_ASM_GF2m -DOPENSSL_BN_ASM_MONT -DOPENSSL_CPUID_OBJ -DPOLY1305_ASM -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM $(PROCESSOR) $@
I'm not able to get the whole build log to exactly see how
libcrypto-shlib-aesv8-armx.o got produced by the compiler. However the Makefile
I got from artifacts is straightforward. I do believe -fPIC option is passed
to the compiler. The linker error does not make sense to me.
The other alternative mentioned in 9.4.2.4 reads as follows:
movs $ptr, #:upper8_15:#.Lrcon
lsls $ptr, $ptr, #8
adds $ptr, #:upper0_7:#.Lrcon
lsls $ptr, $ptr, #8
adds $ptr, #:lower8_15:#.Lrcon
lsls $ptr, $ptr, #8
adds $ptr, #:lower0_7:#.Lrcon
The build has finished successfully. but tests did fail with error as follows:
SHA3 : (KAT_Digest) : Pass
AES_GCM : (KAT_Cipher) : qemu: uncaught target signal 11 (Segmentation fault) - core dumped
../../util/wrap.pl ../../apps/openssl fipsinstall -pedantic -module ../../providers/fips.so -provider_name fips -section_name fips_sect -out ../../test/fipsmodule.cnf => 139
not ok 1 - fips install
# ------------------------------------------------------------------------------
# Failed test 'fips install'
# at test/recipes/00-prep_fipsmodule_cnf.t line 33.
# Looks like you failed 1 test of 1.00-prep_fipsmodule_cnf.t ..
Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/1 subtests
It looks like we die in AES module. I suspect we touch an invalid address. Like the position of .Lrcon constant is miscalculated.
Hitting those corners I've decided to opt for solution no to let aesv8-armx.pl generate .rodata section for 32 bit flavour.
The error is odd. [...] I do believe
-fPICoption is passed to the compiler. The linker error does not make sense to me.
-fPIC will change the kinds of instructions the compiler emits. Since we're dealing with handwritten assembly, the choice of instructions are already burned in, so there's not much the flag can do to change anything. We have to write the assembly in a PIC-compatible form to begin with.
The pattern you've used here loads the absolute address of .Lrcon directly into $ptr. That is, the address itself (split over two instructions) is used as an operand in the code. That's not a position-independent pattern. When the assembler processes that, it emits a relocation because the address is not known yet.
Then, when you go to link it, because you're making a position-independent output (shared library), the linker also cannot resolve the relocation. It would need to be deferred to the loader, which actually knows the address the overall shared library is mapped to.
The linker is then rejecting this because shared libraries aren't supposed to have runtime .text relocations, so that most pages don't need to be dirtied and thus unshared.
What you need is a PC-relative address. Although the assembler syntaxes are similar, aarch64's ADRP + ADD/LDR pattern adds PC to the operand, so what's stored is the delta between the reference and the symbol. That's also a relocation because the assembler doesn't know the distance between .text and .rodata; in the end, all the .text sections are put together and all the .rodata sections are put together. But that relocation is allowed because it's resolved at link time. No matter where you load the shared library, the delta between .text and .rodata is fixed, which in turn fixes the delta between symbol and reference.
In x86_64, this is fairly straightforward, at least within the same shared library , because you just write whatever(%rip) and emit a larger instruction with enough room for this delta. (Though if you've heard of larger code models, those are exactly to account for getting more room in these deltas.) Other, more RISC-y, ISAs are more complex because you can't just make the instruction big enough to fit the delta. That's why aarch64 has the ADRP + ADD/LDR split; a single instruction doesn't have enough room to encode the delta.
Why does this come up when you move from .text to .rodata? It's purely a size thing. When the assembler references symbols from the same section in the same file, the distance is very small. In fact the distance is totally determined by the assembly file and independent of what else you link in, so it's totally predictable. But as soon as you cross sections, it's no longer predictable. It can get as large as the whole of libcrypto.so, or the entire calling application if it links OpenSSL statically.
Even in the single-instruction forms, 32-bit Arm is very complex. You've got both Arm mode and Thumb mode, both of which have very different forms of the relevant instructions, and impact the size of other instructions. Some of the forms use a very tricky immediate encoding where, e.g., you can encode larger multiples of 4 than you can encode odd numbers. There are some files in OpenSSL Arm assembly that are at the very very edge of fitting, such that if you added or removed an instruction, it's less aligned and no longer fits! (Being so tight is only viable because the distance is totally fixed by being a same-section + same-file reference.)
Hitting those corners I've decided to opt for solution no to let
aesv8-armx.plgenerate.rodatasection for32bit flavour.
That's what we did too. We don't attempt to support XOM on 32-bit Arm in BoringSSL. There are much, much fewer patterns available to do this on 32-bit Arm. IIRC the available relocations also differ a lot by platform. (The way OpenSSL does the PC-relative addressing on 32-bit iOS is quite hairy. Thankfully, 32-bit iOS no longer exists.) If you look at compiler output, it usually establishes constant-banks near the functions... which requires reading from .text anyway. Unless some platform comes up that actually needs it (and new platforms ought to move to aarch64 anyway), I think we can ignore this.
@tom-cosgrove-arm could you please look?
Yes, will do; the other part of my job is very busy ATM
Sorry to bother,when do you plan to review this PR? @tom-cosgrove-arm
@tom-cosgrove-arm could you please look?
This PR is in a state where it requires action by @openssl/committers but the last update was 30 days ago
@tom-cosgrove-arm If you're really busy,maybe you can assign someone else to review?
@tom-cosgrove-arm I'm sorry for being pushy, I think it would be great if we will get this to 3.4 so arm64 will be on par with amd64. thanks.
ping for second review
Hi @tom-cosgrove-arm, don’t want to push you too much but, is there any news or deadlines on the PR?
This PR is in a state where it requires action by @openssl/committers but the last update was 30 days ago
@tom-cosgrove-arm You haven't reviewed this PR for a very very long time. Is there any problem?
@t8m can we ask anyone else from maintainers look at this PR?
up
@tom-cosgrove-arm
It seems the review is done, can you merge this PR? @t8m
24 hours has passed since 'approval: done' was set, but as this PR has been updated in that time the label 'approval: ready to merge' is not being automatically set. Please review the updates and set the label manually.
Merged to the master branch after commit message tweaks. Thank you.
crypto/aes/aesv8-armx.S:12:1: error: unknown directive
.previous
^
make[1]: *** [Makefile:5360: crypto/aes/libcrypto-lib-aesv8-armx.obj] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [Makefile:2258: build_sw] Error 2
This PR breaks MinGW Arm64 builds. Looks like #23997
@Andarwinux please open a new issue.
@Andarwinux please open a new issue.
https://github.com/openssl/openssl/issues/26415