Segfault on sample code on init cache - $100 bug fix bounty
Hoping for some pointers on what I might be doing wrong here -
Got the following sample code -
const char myKey[] = "RandomX example key";
const char myInput[] = "RandomX example input";
char hash[RANDOMX_HASH_SIZE];
std::cout << "Get flags" << std::endl;
randomx_flags flags = randomx_get_flags();
std::cout << flags << std::endl;
std::cout << "Allocate cache" << std::endl;
randomx_cache *myCache = randomx_alloc_cache(flags);
if (myCache == nullptr) {
std::cout << "Cache allocation failed" << std::endl;
return SerializeHash(*this);
}
std::cout << "Init cache" << std::endl;
randomx_init_cache(myCache, &myKey, sizeof myKey);
std::cout << "MyMachine" << std::endl;
with output -
Get flags
106
Allocate cache
Init cache
Segmentation fault (core dumped)
free -h is
total used free shared buff/cache available
Mem: 3.8Gi 262Mi 847Mi 1.0Mi 2.7Gi 3.3Gi
Swap: 8.0Gi 40Mi 8.0Gi
Running ubuntu 23.04. Tried compiling with both -DARCH=native and no -DARCH
Which CPU are you using?
Can you provide a stack trace? Run a debug build with gdb and when it crashes, use the bt command.
Thanks for the help, here's that info
Get flags
0
Allocate cache
Program received signal SIGSEGV, Segmentation fault.
0x0000555555a81a37 in randomx::generateSuperscalar(randomx::SuperscalarProgram&, randomx::Blake2Generator&) ()
(gdb) bt
#0 0x0000555555a81a37 in randomx::generateSuperscalar(randomx::SuperscalarProgram&, randomx::Blake2Generator&) ()
#1 0x0000555555a7f72d in randomx::initCache(randomx_cache*, void const*, unsigned long) ()
#2 0x0000555555a75067 in randomx_init_cache ()
#3 0x00005555559df520 in CBlockHeader::GetHash (this=this@entry=0x555555dc2098 <mainParams+1848>)
at primitives/block.cpp:38
#4 0x00005555558e002d in CMainParams::CMainParams (this=0x555555dc1960 <mainParams>) at chainparams.cpp:174
#5 0x0000555555600cbc in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535)
at /usr/include/c++/11/bits/std_thread.h:86
#6 _GLOBAL__sub_I_nMiningForkTime () at chainparams.cpp:1129
#7 0x00007ffff72d6ebb in call_init (env=<optimized out>, argv=0x7fffffffe4f8, argc=1) at ../csu/libc-start.c:145
#8 __libc_start_main_impl (main=0x5555555ea940 <main(int, char**)>, argc=1, argv=0x7fffffffe4f8, init=<optimized out>,
fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe4e8) at ../csu/libc-start.c:379
#9 0x0000555555607955 in _start ()
cpu info (host describes it as an Intel Xeon)
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 40 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 2
On-line CPU(s) list: 0,1
Vendor ID: GenuineIntel
Model name: Intel Core Processor (Skylake, IBRS)
CPU family: 6
Model: 94
Thread(s) per core: 2
Core(s) per socket: 1
Socket(s): 1
Stepping: 3
BogoMIPS: 7391.99
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2
ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq
ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand
hypervisor lahf_lm abm cpuid_fault invpcid_single pti ssbd ibrs ibpb fsgsbase bmi1 avx2 smep bmi2
erms invpcid xsaveopt arat
Virtualization features:
Hypervisor vendor: Microsoft
Virtualization type: full
Caches (sum of all):
L1d: 64 KiB (2 instances)
L1i: 64 KiB (2 instances)
L2: 4 MiB (1 instance)
L3: 16 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0,1
- Your stack trace doesn't include line numbers, which means you are not using a debug build of librandomx.a. Can you please repeat it with a debug build?
- Have you made any changes to the randomx code? Especially in files configuration.h or superscalar.cpp?
- Can you enable the trace output? You'll have to edit CMakeLists.txt and add add_definitions(-DTRACE) somewhere near the top and rebuild librandomx.a.
- Ok.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Get flags
0
Allocate cache
Program received signal SIGSEGV, Segmentation fault.
0x0000555555a9e7b0 in randomx::SuperscalarInstructionInfo::getType (this=0x0) at /root/RandomX/src/superscalar.cpp:172
172 return type_;
(gdb) bt
#0 0x0000555555a9e7b0 in randomx::SuperscalarInstructionInfo::getType (this=0x0) at /root/RandomX/src/superscalar.cpp:172
#1 0x0000555555a9f44f in randomx::SuperscalarInstruction::getType (this=0x7fffffffd400) at /root/RandomX/src/superscalar.cpp:539
#2 0x0000555555a9c920 in randomx::generateSuperscalar (prog=..., gen=...) at /root/RandomX/src/superscalar.cpp:681
#3 0x0000555555a98849 in randomx::initCache (cache=0x555555f1f840, key=0x7fffffffdf10, keySize=20) at /root/RandomX/src/dataset.cpp:130
#4 0x0000555555a744c5 in randomx_init_cache (cache=0x555555f1f840, key=0x7fffffffdf10, keySize=20) at /root/RandomX/src/randomx.cpp:130
#5 0x00005555559de5c0 in CBlockHeader::GetHash (this=this@entry=0x555555de60d8 <mainParams+1848>) at primitives/block.cpp:38
#6 0x00005555558df0cd in CMainParams::CMainParams (this=0x555555de59a0 <mainParams>) at chainparams.cpp:174
#7 0x00005555556008bc in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535)
at /usr/include/c++/11/bits/std_thread.h:86
#8 _GLOBAL__sub_I_nMiningForkTime () at chainparams.cpp:1129
#9 0x00007ffff72d6ebb in call_init (env=<optimized out>, argv=0x7fffffffe4d8, argc=1) at ../csu/libc-start.c:145
#10 __libc_start_main_impl (main=0x5555555ea540 <main(int, char**)>, argc=1, argv=0x7fffffffe4d8, init=<optimized out>, fini=<optimized out>,
rtld_fini=<optimized out>, stack_end=0x7fffffffe4c8) at ../csu/libc-start.c:379
#11 0x00005555556069f5 in _start ()
- No changes yet, just trying to get the default running before experimenting.
- Will try next
You'll have to edit CMakeLists.txt and add add_definitions(-DTRACE) somewhere near the top and rebuild librandomx.a.
Did this but doesn't seem to have made any difference to the output. Sorry, I'm unfamiliar with these debugging tools.
This stacktrace is a bit more helpful, but I still don't see why it crashes.
The crash happens on the very first iteration here:
https://github.com/tevador/RandomX/blob/901f8ef765e7c274852dcb4d477247fd6747a5b8/src/superscalar.cpp#L681
currentInstruction is initialized to SuperscalarInstruction::Null, which is initialized with a pointer to SuperscalarInstructionInfo::NOP, but in the call to SuperscalarInstructionInfo::getType, your this pointer is null, which shouldn't happen.
Can you try to compile and run just the example code from here without the bitcoin wrapper you are using? https://github.com/tevador/RandomX/blob/master/src/tests/api-example1.c
Which compiler version are you using?
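For readers following along: frames __static_initialization_and_destruction_0 and _GLOBAL__sub_I_nMiningForkTime in the backtrace show that CBlockHeader::GetHash is being called while the static globals of chainparams.cpp are still being constructed. A minimal sketch of how a pointer can read as null in that situation is below. It is not RandomX code: Info, Instruction, EarlyUser, g_runtime, runtimePointer and reportInitOrder are hypothetical names, and it assumes (unconfirmed in this thread) that the library's own static objects had not been initialized yet when the call happened.

```cpp
#include <cassert>
#include <iostream>

// Hypothetical runtime-only value. Reading a volatile keeps the compiler
// from turning the initializers below into compile-time constants.
static volatile int g_runtime = 1;

// Stand-in for an "instruction info" record (not the RandomX class).
struct Info {
    int type_;
    explicit Info(int t) : type_(t) {}
    int getType() const { return type_; }
};

// Plain aggregate holding a pointer. Objects with static storage are
// zero-initialized first, so info_ is nullptr until the dynamic
// initializer of the object has run.
struct Instruction {
    const Info* info_;
};

// Opaque at compile time because of the volatile read.
static const Info* runtimePointer(const Info* p) {
    return g_runtime ? p : nullptr;
}

extern Instruction Null;  // defined further down in this file

// Stand-in for a global whose constructor uses the library too early.
struct EarlyUser {
    bool sawNull;
    EarlyUser() : sawNull(Null.info_ == nullptr) {}
};

// Within one translation unit, dynamic initialization follows definition
// order, so earlyUser is constructed before NOP and Null.
EarlyUser earlyUser;
Info NOP(g_runtime);
Instruction Null = { runtimePointer(&NOP) };

// Call from main(): by then every static object has been initialized.
bool reportInitOrder() {
    assert(Null.info_ == &NOP);
    std::cout << "null pointer seen during static init: "
              << std::boolalpha << earlyUser.sawNull << std::endl;
    return earlyUser.sawNull;
}
```

Calling reportInitOrder() from main is expected to return true: the early constructor saw a null info_ even though the pointer is valid once main starts. Across separate translation units (for example an application object file and a static library) the relative order is unspecified, which would also fit the standalone example working while the larger build crashes.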
Compiling using gcc api-example1.c -L/root/RandomX/build -lrandomx -lstdc++ -lm -lc succeeds and program runs correctly. Also the benchmark and tests run correctly.
gcc -v
root@bchx:~/RandomX# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 11.3.0-1ubuntu1~22.04.1' --with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-11 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-11-aYxV0E/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-aYxV0E/gcc-11-11.3.0/debian/tmp-gcn/usr --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-serialization=2
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.3.0 (Ubuntu 11.3.0-1ubuntu1~22.04.1)
I'm compiling in the context of the Bitcoin Unlimited makefile + links to the compiled librandomx.a -
https://gitlab.com/bitcoinunlimited/BCHUnlimited/-/blob/dev/src/Makefile.am?ref_type=heads
maybe it's adding some incompatible compiler option?
It appears that the constructors of static globals are not being called in your case. It's probably a problem with the linker. Can you try linking with the --whole-archive option? Otherwise I'm out of ideas.
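If the cause is an initialization-order problem rather than the linker dropping constructors, a general C++ idiom that sidesteps it is "construct on first use": the object lives in a function-local static and is built the first time it is requested. The sketch below only illustrates the idiom. Info, Instruction, EarlyUser, nopInfo, nullInstruction, g_runtime and reportFirstUse are hypothetical names, it is not RandomX code, and it is not necessarily what any fix discussed in this thread does.

```cpp
#include <cassert>
#include <iostream>

// Hypothetical runtime-only value (the volatile read blocks constant folding).
static volatile int g_runtime = 1;

// Stand-in for an "instruction info" record (not the RandomX class).
struct Info {
    int type_;
    explicit Info(int t) : type_(t) {}
    int getType() const { return type_; }
};

// Function-local static: constructed on the first call, whenever that is.
const Info& nopInfo() {
    static const Info nop(g_runtime);
    return nop;
}

struct Instruction {
    const Info* info_;
    explicit Instruction(const Info* info) : info_(info) {}
    int getType() const { return info_->getType(); }
};

// Same idiom again. It depends on nopInfo(), which is built on demand.
const Instruction& nullInstruction() {
    static const Instruction null(&nopInfo());
    return null;
}

// Stand-in for a global whose constructor runs during static initialization.
struct EarlyUser {
    int observedType;
    EarlyUser() : observedType(nullInstruction().getType()) {}
};

// Safe even though it is constructed before main(): the objects it needs
// are created by the calls above, not by definition order.
EarlyUser earlyUser;

// Call from main() to print and return what the early constructor observed.
int reportFirstUse() {
    assert(nullInstruction().info_ == &nopInfo());
    std::cout << "type observed during static initialization: "
              << earlyUser.observedType << std::endl;
    return earlyUser.observedType;
}
```

The same idea can be applied on the calling side: deferring the first call into the library until after main has started (instead of making it from a global object's constructor) avoids depending on the order in which static objects are built.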
OK, Thanks for the debugging help and the suggestion. Unfortunately --whole-archive hasn't made any difference :(
@sickpig - any idea why calling randomx in the context of a bitcoin-unlimited makefile build would cause a segfault? Any options I could tweak to avoid it?
Just an update - tried with a clean Ubuntu 20.04 install - same error.
Adding a small $50 bounty for a fix, or suggestion that leads to a fix.
@sickpig - any idea why calling randomx in the context of a bitcoin-unlimited makefile build would cause a segfault? Any options I could tweak to avoid it?
sorry but what do you mean in the context of bitcoin-unlimited? or even better how BU is related to randomX?
what are you trying to do?
Hey, thanks for stopping by - I'm working on a project that uses the BU codebase (and build/makefile), but I'm using the RandomX library as a PoW function. When I call the RandomX functions, I get a segmentation fault. I thought you might have a knowledge of anything special or unusual going on with the BU build process that might cause the error?
@FreeTrade I have found this from Upwork.
It compiles and runs fine for me, no seg fault:
Get flags
106
Allocate cache
Init cache
MyMachine
I am using Ubuntu.
This is how I have built the binary:
g++ segFault.cpp -o segFault -lrandomx
Full code:
#include <iostream>
#include <randomx.h>
int main()
{
const char myKey[] = "RandomX example key";
const char myInput[] = "RandomX example input";
char hash[RANDOMX_HASH_SIZE];
std::cout << "Get flags" << std::endl;
randomx_flags flags = randomx_get_flags();
std::cout << flags << std::endl;
std::cout << "Allocate cache" << std::endl;
randomx_cache *myCache = randomx_alloc_cache(flags);
if (myCache == nullptr) {
std::cout << "Cache allocation failed" << std::endl;
//return SerializeHash(*this);
return 1;
}
std::cout << "Init cache" << std::endl;
randomx_init_cache(myCache, &myKey, sizeof myKey);
std::cout << "MyMachine" << std::endl;
return 0;
}
Note: I have also built the RandomX library using the provided cmake instructions:
sudo make install
[ 79%] Built target randomx
[ 88%] Built target randomx-benchmark
[ 94%] Built target randomx-codegen
[100%] Built target randomx-tests
Install the project...
-- Install configuration: "Release"
-- Installing: /usr/local/lib/librandomx.a
-- Installing: /usr/local/include/randomx.h
@avrdan, thanks, yes the code runs if compiled standalone - the challenge seems to be getting it to run in the context of a Bitcoin Unlimited build.
Did you edit the Bitcoin Unlimited makefile to include the librandomx.a library? If yes, we need to add another entry like the following:
librandomx_a_CPPFLAGS = $(AM_CPPFLAGS) $(BITCOIN_INCLUDES)
librandomx_a_CXXFLAGS = $(AM_CXXFLAGS) $(PIE_FLAGS)
librandomx_a_SOURCES = \
randomx.h
I am assuming you have already done this? In any case, including this library should work the same way, regardless of the main project.
Yes, thanks, I'm going to make a repo with my changes so it is clearer what the problem is.
Ok, here's the full repo -
https://gitlab.com/FreeTrade68/bchrx
The changes to include randomx library
https://gitlab.com/FreeTrade68/bchrx/-/compare/dev...dev?from_project_id=19725714
(Quite possible I've done something silly in the makefile that causes the problem. Configuring libs is not my strong suit)
To build
Probably need these
sudo apt-get install build-essential libtool autotools-dev autoconf automake pkg-config libssl-dev libevent-dev bsdmainutils git
sudo apt-get install libboost-all-dev
sudo apt-get install libminiupnpc-dev
sudo apt-get install libzmq3-dev
sudo apt install libdb5.3++ libdb5.3++-dev
then
git clone --single-branch https://gitlab.com/FreeTrade68/bchrx
cd bchrx/
./autogen.sh
./configure --disable-tests --with-incompatible-bdb --enable-upnp-default --with-gui=no
make
Error
root@bchx:~/bchrx# ./src/bitcoind
Get flags
0
Allocate cache
Segmentation fault (core dumped)
Are you sure /bchrx/src/randomx is the path to the installation of your RandomX? Meaning that you have the include and library files in there?
I used the default install folders, so the lib goes under /usr/local/lib and the include under /usr/local/include. This is cleaner anyway, as you shouldn't have libraries in a src folder, but that is just a side note.
Pretty sure it's not a matter of wrong paths. It compiles and some of the functions are successfully called. No doubt there are ways to clean up the folder location - but focused on just getting it running first.
This must be more difficult than I had thought. Increasing bounty to $100
Good Job.
@FreeTrade check your messages on upwork from me.
Regards Anshul Mittal
@anshulmttl Thanks Anshul - this is now a public bounty so I can't assign it as a project. First solution posted here wins the bounty.
@FreeTrade Since you have contacted me on Upwork you will have to provide me project on Upwork only.
Ok, the bounty is suspended while I confirm Anshul has found a solution.
Is this issue solved?
@FreeTrade you can also try to build BCHUnlimited code with --disable-hardening flag for configure.ac. These "hardening" flags might mess up linking with RandomX library. I suspect it's some incompatibility between compiler/linker flags you use for RandomX and for BCHUnlimited.
Thanks for the suggestion - actually yes I did try the --disable-hardening but alas it didn't resolve it for me. Will likely move forward with @anshulmttl resolution, although I accept your reservations with it.
Accepted @anshulmttl resolution for the problem and paid $100 bounty. https://github.com/tevador/RandomX/pull/272
If anyone else runs into the same issue - be aware of @SChernykh's note that this is probably an issue with the build/compile setup, so this may be a workaround for a different problem rather than a fix.