i2pd immediately segfaults on OpenBSD i386 hardware
I have been running i2pd 2.46.1 from OpenBSD 7.2 ports successfully (https://github.com/PurpleI2P/i2pd/issues/1509#issuecomment-2146716079).
After upgrading, on both OpenBSD 7.4 (with i2pd version 2.49) and OpenBSD 7.5 (with i2pd version 2.52), i2pd immediately terminates with the error "Illegal instruction".
Here is the backtrace:
(gdb) bt
#0 0x17c7a7dc in _GLOBAL__sub_I_Daemon.cpp () from /root/i2pd
#1 0x0fb79e1e in _dl_find_shlib (sodp=0x70c76400, searchpath=0xcf7f6f24,
nohints=263716900) at util.h:143
#2 0x0fb7463b in _dl_boot (argv=0xcf7f6f24, envp=0xcf7f6f2c,
dyn_loff=263651328, dl_data=0xcf7f6edc)
at /usr/src/libexec/ld.so/loader.c:769
#3 0x0fb7534f in _dl_finalize_object (objname=0x0, dynp=Variable "dynp" is not available.
)
at /usr/src/libexec/ld.so/resolve.c:327
#4 0xcf7f6fac in ?? ()
#5 0x00000000 in ?? ()
This is the hardware:
hw.machine=i386
hw.model=Geode(TM) Integrated Processor by AMD PCS ("AuthenticAMD" 586-class)
hw.ncpu=1
hw.byteorder=1234
hw.pagesize=4096
hw.cpuspeed=500
hw.vendor=Soekris Engineering
hw.product=net5501
hw.physmem=536363008
On an i386 QEMU virtual machine (same versions), i2pd runs as expected.
The following is equal on both systems:
# file `which i2pd`
/usr/local/bin/i2pd: ELF 32-bit LSB shared object, Intel 80386, version 1
# cksum /usr/local/bin/i2pd
1503362510 7636516 /usr/local/bin/i2pd
But ld.so differs:
# sysctl hw.vendor && cksum /usr/libexec/ld.so
hw.vendor=Soekris Engineering
3844534076 247864 /usr/libexec/ld.so
# sysctl hw.vendor && cksum /usr/libexec/ld.so
hw.vendor=QEMU
3079716773 247864 /usr/libexec/ld.so
Try building the binary yourself. By the way, commit 0c924836cf9ae04cd12e2647e6edd5c0f896ff7b fixes a regression with LibreSSL, so it is better to build a version after this commit.
I don't know where you got your binary from, but this file https://cdn.openbsd.org/pub/OpenBSD/7.5/packages/i386/i2pd-2.50.2.tgz was built with SSE instructions enabled, which your CPU does not have.
Probably global compiler defaults were changed at some point, so specifying -march=i586 is needed now.
specifying -march=i586 is needed now
I am not sure if I did this right - I added -march=i586 to Makefile.bsd like this (git master branch):
CXXFLAGS ?= ${CXX_DEBUG} -Wall -Wextra -Wno-unused-parameter -pedantic -Wno-misleading-indentation -march=i586
... and on a second attempt I updated CMakeLists.txt, even though I think it should have the same effect:
--- a/build/CMakeLists.txt
+++ b/build/CMakeLists.txt
@@ -197,6 +197,7 @@ if(UNIX)
# "'sleep_for' is not a member of 'std::this_thread'" in gcc 4.7/4.8
add_definitions("-D_GLIBCXX_USE_NANOSLEEP=1")
endif()
+ set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=i586")
endif()
In any case, i2pd still fails, but now the backtrace looks like this:
(gdb) bt
#0 0x1a524a0a in __cxx_global_var_init.8 () from /root/i2pd/build/i2pd
#1 0x1a5282cd in _GLOBAL__sub_I_Daemon.cpp () from /root/i2pd/build/i2pd
#2 0x080e7a0e in kdoprnt (fd=Variable "fd" is not available.
) at /usr/src/libexec/ld.so/dl_printf.c:80
#3 0x080e351b in _dl_boot (argv=0xcf7ca6f4, envp=0xcf7ca6fc, dyn_loff=135131136, dl_data=0xcf7ca6ac) at /usr/src/libexec/ld.so/loader.c:746
#4 0x080ed25f in _dl_rtld (object=0x0) at /usr/src/libexec/ld.so/loader.c:1036
#5 0xcf7ca77c in ?? ()
#6 0x00000000 in ?? ()
If the crash location differs depending on the -march option, then the option was specified in the correct place.
I suspect that boost or libressl may be built with SSE as well.
It is also worth trying -march=i486 and -march=i386, because the "586-class" in your output may not be entirely correct.
Showing exactly which instruction caused the crash may be useful as well.
Here is what I found after searching how to do this:
https://stackoverflow.com/a/40223712 ((gdb) layout asm is the right command probably)
I looked at boost libraries from: https://cdn.openbsd.org/pub/OpenBSD/7.5/packages/i386/boost-1.84.0p2v0.tgz
I have not found SSE there yet, but I saw a cmovz instruction.
It looks like it was introduced with the Pentium Pro.
I am not sure if your processor is Pentium Pro compatible.
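One way to settle the Pentium Pro question might be to look at the feature flags the kernel printed at boot. This is only a minimal sketch: it assumes the flags appear in dmesg on a comma-separated line starting with "cpu0: ", and has_cpu_flag is a made-up helper name, not part of any existing tool.

```shell
# has_cpu_flag FLAG
# Hypothetical helper: reads CPU feature line(s) on stdin and succeeds
# if FLAG appears as a whole word. Assumes the flags are separated by
# commas, colons or spaces.
has_cpu_flag() {
    tr ',: ' '\n\n\n' | grep -xF -- "$1" >/dev/null
}

# Intended use on the affected machine (not run here; the dmesg line
# format is an assumption):
#   dmesg | grep '^cpu0: ' | has_cpu_flag CMOV && echo "CMOV is reported"
#   dmesg | grep '^cpu0: ' | has_cpu_flag SSE2 || echo "SSE2 is not reported"
```

If CMOV is listed but SSE and SSE2 are not, then cmovz in boost would be harmless while any xmm instruction would still trap.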
saw cmovz command
(gdb) layout asm gives this (scrollable) view:
> 0x1a0f09ea <__cxx_global_var_init.8+3978> movsd 0x1e8(%esp),%xmm0
  0x1a0f09f3 <__cxx_global_var_init.8+3987> mov %esp,%eax
  0x1a0f09f5 <__cxx_global_var_init.8+3989> movsd %xmm0,0x4(%eax)
  0x1a0f09fa <__cxx_global_var_init.8+3994> lea 0x118(%esp),%ecx
  0x1a0f0a01 <__cxx_global_var_init.8+4001> mov %ecx,0xc(%eax)
  0x1a0f0a04 <__cxx_global_var_init.8+4004> lea 0x2310(%ebx),%ecx
  0x1a0f0a0a <__cxx_global_var_init.8+4010> mov %ecx,(%eax)
  0x1a0f0a0c <__cxx_global_var_init.8+4012> call 0x1a0f4a30 <_ZNSt3__13mapINS_12basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEEN3i2p4i18n8langD
  0x1a0f0a11 <__cxx_global_var_init.8+4017> jmp 0x1a0f0a16 <__cxx_global_var_init.8+4022>
  0x1a0f0a16 <__cxx_global_var_init.8+4022> lea 0x520(%esp),%eax
  0x1a0f0a1d <__cxx_global_var_init.8+4029> mov %eax,0x30(%esp)
  0x1a0f0a21 <__cxx_global_var_init.8+4033> add $0x440,%eax
  0x1a0f0a26 <__cxx_global_var_init.8+4038> mov %eax,0x34(%esp)
  0x1a0f0a2a <__cxx_global_var_init.8+4042> mov 0x34(%esp),%eax
  0x1a0f0a2e <__cxx_global_var_init.8+4046> mov 0x100(%esp),%ebx
  0x1a0f0a35 <__cxx_global_var_init.8+4053> add $0xffffffc0,%eax
  0x1a0f0a38 <__cxx_global_var_init.8+4056> mov %eax,0x2c(%esp)
  0x1a0f0a3c <__cxx_global_var_init.8+4060> mov %eax,(%esp)
  0x1a0f0a3f <__cxx_global_var_init.8+4063> call 0x1a0f4b40 <_ZNSt3__14pairIKNS_12basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEEN3i2p4i18n8lan
  0x1a0f0a44 <__cxx_global_var_init.8+4068> mov 0x30(%esp),%ecx
  0x1a0f0a48 <__cxx_global_var_init.8+4072> mov 0x2c(%esp),%eax
  0x1a0f0a4c <__cxx_global_var_init.8+4076> cmp %ecx,%eax
  0x1a0f0a4e <__cxx_global_var_init.8+4078> mov %eax,0x34(%esp)
  0x1a0f0a52 <__cxx_global_var_init.8+4082> jne 0x1a0f0a2a <__cxx_global_var_init.8+4042>
  0x1a0f0a58 <__cxx_global_var_init.8+4088> mov 0x100(%esp),%ebx
  0x1a0f0a5f <__cxx_global_var_init.8+4095> lea 0x1f0(%esp),%eax
core process 380311 In: __cxx_global_var_init.8 Line: ?? PC: 0x1a0f09ea
It is also worth trying -march=i486 and -march=i386, because the "586-class" in your output may not be entirely correct.
I will check - compiling with -march=i386 right now ...
I suspect that boost or libressl may be built with SSE as well.
Would that affect the issue even though I'm building a dynamic library?
movsd is an SSE instruction. I wonder where it comes from.
It should not appear with -march=i586.
Did you clean the object files from previous builds?
I wonder if it comes from the C++ runtime.
If that's the case, the compiler toolchain will need to be rebuilt with -march.
Would that affect the issue even though I'm building a dynamic library?
I didn't know this. It should be fine then: if there are problems, they will appear in the library module, not in the i2pd binary code. If LibreSSL is dynamically linked as well, there should be no problems there either.
I generally see fewer lines in layout asm with the -march=i386 build, but it results in the same issue (movsd).
movsd is SSE command. I wonder where it comes from. Did you cleaned object files from previous builds?
I did a simple make clean.
Would that affect the issue even though I'm building a dynamic library?
Didn't knew this.
Actually, at the beginning I tried to build a static binary, but then linking would fail.
I guess with i386 failing, it doesn't make much sense to try i486?
I guess with i386 failing, it doesn't make much sense to try i486?
There should be no SSE instructions for i386, i486 and i586.
I suspect the USE_AESNI option may be causing the problem.
It is enabled by default, so try setting it to no.
You may also look at the exact options used when the source files are compiled.
There should be no extra -march options and no -msse.
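To check the options actually passed to the compiler, something like the following could be used on a verbose build log. It is a rough sketch: list_machine_flags is a made-up helper name, and it only looks for -march=, -msse* and -maes.

```shell
# list_machine_flags
# Hypothetical helper: reads compiler command lines on stdin and prints
# each distinct -march=..., -msse... or -maes option once, sorted.
list_machine_flags() {
    grep -oE -- '-m(arch=[^ ]+|sse[0-9.a-z]*|aes)' | sort -u
}

# Intended use after a clean, so that every file is recompiled (not run here):
#   ninja -v 2>&1 | list_machine_flags          # CMake with the Ninja generator
#   make VERBOSE=1 2>&1 | list_machine_flags    # CMake-generated Makefiles
```

Any -maes or -msse line in the output, or a second -march value, would explain SSE code appearing despite -march=i586.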
Looks like I figured out what happened.
USE_AESNI enables -maes, which automatically enables -msse.
Modern compilers allow using AES-NI instructions without the -maes option, but the option can't easily be removed because it should be possible to build i2pd with older compilers as well.
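The effect of -maes can be observed through the compiler's predefined macros. A small sketch, assuming a gcc- or clang-compatible cc that accepts -dM -E; simd_macros is a made-up helper name:

```shell
# simd_macros
# Hypothetical helper: reads the output of `cc -dM -E` on stdin and prints
# the names of the predefined SSE/AES feature macros it contains, sorted.
simd_macros() {
    awk '$1 == "#define" && $2 ~ /^__(SSE[0-9_]*|AES)__$/ { print $2 }' | sort
}

# Intended use (needs an x86 compiler, so not run here); compare the two:
#   cc -march=i586 -dM -E -x c /dev/null | simd_macros
#   cc -march=i586 -maes -dM -E -x c /dev/null | simd_macros
```

If the second command lists SSE macros next to __AES__ while the first lists none, the compiler is free to emit xmm instructions anywhere in files built with -maes.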
I decided to try it myself: I installed OpenBSD from install75.iso, added git, boost, cmake and ninja, and built the binary like so (from the build directory):
cmake -DCMAKE_CXX_FLAGS="-march=i386" -DWITH_AESNI=OFF -DWITH_GIT_VERSION=ON -DCMAKE_BUILD_TYPE=Release -G Ninja .
ninja
Here is the binary which was produced: i2pd_openbsd_7_5_i386.zip
I see no SSE instructions there.
I can confirm your findings.
Also, I found that the error goes away even if I do not provide any -march= at all, meaning that -DWITH_AESNI=OFF alone is enough to get a working executable for my CPU.
Note that -march=i386 and -march=i486 produce exactly the same binary:
# diff i2pd-i386 i2pd-i486 && echo same
same
USE_AESNI enables -maes, which automatically enables -msse.
Can I get AES-NI without SSE?
I see no SSE instructions there.
How can I check this myself on the system that doesn't produce the error?
this file https://cdn.openbsd.org/pub/OpenBSD/7.5/packages/i386/i2pd-2.50.2.tgz was built with SSE instructions enabled
I wonder what changed in comparison to their 7.2 package, which still worked on my hardware ...
How can I check this myself on the system that doesn't produce the error?
I searched for the text "xmm" in the disassembly produced by IDA. Ghidra should work as well, but I have no experience with it. It is also possible to instruct the compiler to generate an assembly listing.
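The same "xmm" search can probably be done without IDA on a command-line disassembly. A minimal sketch, assuming an objdump that can disassemble the binary is installed; count_xmm is a made-up helper name:

```shell
# count_xmm
# Hypothetical helper: reads a disassembly listing on stdin and prints how
# many lines mention an xmm register (a rough indicator of SSE code).
count_xmm() {
    # grep -c exits non-zero when the count is 0; treat that as success.
    grep -c 'xmm[0-9]' || true
}

# Intended use (not run here):
#   objdump -d ./i2pd | count_xmm
```

A count of 0 suggests no SSE code in the binary itself. It says nothing about the shared libraries it loads, and data that happens to sit in a code section may be decoded as bogus instructions.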
Can I get AES-NI without SSE?
AES-NI instructions use SSE registers, so on real CPUs that support AES-NI, both instruction sets will be available. However, i2pd uses dynamic detection and doesn't execute AES-NI instructions if the CPU reports no support for them. So theoretically it is possible to make a binary with SSE + AES-NI instructions located only in places that are guarded by such a check. It would require a lot of thinking and testing, however.
I wonder what changed in comparison to their 7.2 package, which still worked on my hardware ...
It is possible to try building it yourself and see, but I don't think it's worth it.
Another interesting result, which I did not expect but which just came to mind, was this check:
# diff i2pd-i586-noaesni i2pd-noaesni && echo same
same
In any case, I'm back up running.
Pleasure working with you - thanks for the guidance!