GOTCHA
Build fails starting with glibc v2.34
As of version 2.34 of glibc (which is the latest as of Oct 2021), GOTCHA fails to build. The problem is that GOTCHA uses `_dl_sym`, and in v2.34 this is no longer exported.
This is going to take some effort to fix. Gotcha needs to wrap dlsym() to implement its function interception. But you need _dl_sym() to correctly implement a wrapper, since wrappers change the semantics of RTLD_NEXT.
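For anyone following along, here is a minimal sketch of why wrapping `dlsym()` changes what `RTLD_NEXT` means. It is not GOTCHA's code: `sketch_dlsym_wrapper` is a made-up name, and `__builtin_return_address` is a GCC/Clang builtin. glibc's `dlsym()` resolves `RTLD_NEXT` relative to the object its caller lives in, so once a call is routed through a wrapper the search is anchored at the wrapper's library instead of the original caller's. As I understand it, `_dl_sym()` accepts the caller's address as an explicit argument, which is why a correct wrapper wanted it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* glibc only declares RTLD_NEXT when this is set */
#endif
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

#ifndef RTLD_NEXT
/* Fallback if _GNU_SOURCE took effect too late; this is the value glibc uses. */
#define RTLD_NEXT ((void *) -1l)
#endif

/*
 * Hypothetical wrapper, for illustration only (not GOTCHA's implementation).
 * It forwards to the real dlsym() and reports the address it was called from.
 */
__attribute__((noinline))
void *sketch_dlsym_wrapper(void *handle, const char *name, void **caller_out)
{
    assert(name != NULL);

    if (caller_out != NULL) {
        /* An address inside whoever called the wrapper: the "real" caller. */
        *caller_out = __builtin_return_address(0);
    }

    /*
     * From dlsym()'s point of view the caller is now this wrapper (unless the
     * compiler turns the call into a tail call). With RTLD_NEXT the search
     * therefore starts after the object containing the wrapper, not after
     * the object containing the real caller. The public dlsym() has no
     * parameter through which the captured address could be passed on.
     */
    return dlsym(handle, name);
}
```

When the wrapper and its caller sit in the same object (as in a single test program) the result is the same either way. The difference only shows up when the wrapper lives in a separate shared library such as libgotcha. On glibc older than 2.34 this needs `-ldl` at link time.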
@rgmiller @mplegendre Is this the failure you are seeing? I'm getting this with spack:
==> No patches needed for gotcha
==> gotcha: Executing phase: 'cmake'
==> gotcha: Executing phase: 'build'
==> Error: ProcessError: Command exited with status 2:
'make' '-j12'
9 errors found in build log:
109 make[2]: Entering directory '/opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-build-cqnzpr4'
110 [ 89%] Building C object src/example/autotee/CMakeFiles/autotee_test.dir/test_autotee.c.o
111 cd /opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-build-cqnzpr4/src/example/autotee && /home/jgalarowicz/spack-12-16-2021/spack/lib/spac
k/env/gcc/gcc -I/opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-src/include -I/opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-
develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-src/src -O2 -g -DNDEBUG -MD -MT src/example/autotee/CMakeFiles/autotee_test.dir/test_autotee.c.o -MF CMakeFiles/autotee_test.dir/test_autotee.c.o.d
-o CMakeFiles/autotee_test.dir/test_autotee.c.o -c /opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-src/src/example/autotee/test_autotee.
c
112 [ 94%] Linking C executable symb_look
113 cd /opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-build-cqnzpr4/src/example/minimal && /home/jgalarowicz/spack-12-16-2021/spack/opt/spac
k/linux-ubuntu21.10-skylake/gcc-8.5.0/cmake-3.21.4-vvjd47wit3w4na6zyu7cpowzkzumihrl/bin/cmake -E cmake_link_script CMakeFiles/symb_look.dir/link.txt --verbose=1
114 /home/jgalarowicz/spack-12-16-2021/spack/lib/spack/env/gcc/gcc -O2 -g -DNDEBUG -rdynamic CMakeFiles/symb_look.dir/symbolLookup.c.o -o symb_look -Wl,-rpath,/opt/tempspack/spack-stage/jgalarowicz
/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-build-cqnzpr4/src/example/minimal:/opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy
2bfhh/spack-build-cqnzpr4/src libwrap_me.so ../../libgotcha.so.2.0.2
>> 115 /usr/bin/ld: ../../libgotcha.so.2.0.2: undefined reference to `_dl_sym'
>> 116 collect2: error: ld returned 1 exit status
>> 117 make[2]: *** [src/example/minimal/CMakeFiles/symb_look.dir/build.make:102: src/example/minimal/symb_look] Error 1
118 make[2]: Leaving directory '/opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-build-cqnzpr4'
>> 119 make[1]: *** [CMakeFiles/Makefile2:253: src/example/minimal/CMakeFiles/symb_look.dir/all] Error 2
120 make[1]: *** Waiting for unfinished jobs....
121 [100%] Linking C executable autotee_test
122 cd /opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-build-cqnzpr4/src/example/autotee && /home/jgalarowicz/spack-12-16-2021/spack/opt/spac
k/linux-ubuntu21.10-skylake/gcc-8.5.0/cmake-3.21.4-vvjd47wit3w4na6zyu7cpowzkzumihrl/bin/cmake -E cmake_link_script CMakeFiles/autotee_test.dir/link.txt --verbose=1
123 /home/jgalarowicz/spack-12-16-2021/spack/lib/spack/env/gcc/gcc -O2 -g -DNDEBUG -rdynamic CMakeFiles/autotee_test.dir/test_autotee.c.o -o autotee_test -Wl,-rpath,/opt/tempspack/spack-stage/jgala
rowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-build-cqnzpr4/src/example/autotee:/opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrz
hp7boy2bfhh/spack-build-cqnzpr4/src libautotee.so ../../libgotcha.so.2.0.2
>> 124 /usr/bin/ld: ../../libgotcha.so.2.0.2: undefined reference to `_dl_sym'
>> 125 collect2: error: ld returned 1 exit status
>> 126 make[2]: *** [src/example/autotee/CMakeFiles/autotee_test.dir/build.make:102: src/example/autotee/autotee_test] Error 1
127 make[2]: Leaving directory '/opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-build-cqnzpr4'
>> 128 make[1]: *** [CMakeFiles/Makefile2:200: src/example/autotee/CMakeFiles/autotee_test.dir/all] Error 2
129 make[1]: Leaving directory '/opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-build-cqnzpr4'
>> 130 make: *** [Makefile:139: all] Error 2
See build log for details:
/opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-build-out.txt
Yes, that's the failure that goes with this bug.
@mplegendre Are there any workarounds for this issue so we can continue to use gotcha? Thanks.
we started seeing this while using gotcha via caliper
~~the following appears to work, but I can't say if it's even triggered by our workflow.~~
Edit: code removed.
> we started seeing this while using gotcha via caliper
> the following appears to work, but I can't say if it's even triggered by our workflow
Unfortunately that's not going to work - the function is a wrapper for `dlsym`, so you'll just run into an infinite recursion if you call it from within. Simply using the original dlsym via `orig_dlsym` doesn't work either - I tried that and weird things happen. I don't know enough about how ld works to fix it.
Meanwhile if this breaks building Caliper you can turn off Gotcha support with `-DWITH_GOTCHA=Off`. This primarily affects the MPI wrappers, which will fall back to PMPI and have slightly higher runtime overheads than the selective wrapping we can do with GOTCHA. You also lose the ability to wrap `pthread_create` and malloc/free, but they're not needed that often. If you just want to do region profiling you should be fine with turning Gotcha off in Caliper.
@daboehme Since you have an internal build of GOTCHA in Caliper, you might just want to pull in the work-around in #101 in the event that it isn't accepted.
@mplegendre Hi Matt - we are dead in the water with building our survey performance tool on newer systems. We don't use caliper so it sounds like any workarounds mentioned above don't help in our gotcha usage scenario. Will there be a general fix sometime soon or do we need to find an alternative to gotcha to use in our survey tool? Thanks for your time!
@jgalarowicz I don't use caliper either. The fix in the PR #101 is not specific to it.
@jrmadsen Thank you for letting me know, it appeared to be a caliper specific fix when I looked at it. I will give it a try. Thanks again!
I hit this bug today, and with caliper. Just came here to say "I have been got." :laughing:
Any word on when/if #101 might be included in a release? We're pondering including it as a patch in the spack package since there are some gotcha dependants that we can't deploy on our target platforms because of this issue.
For the record, just yesterday, I added another patch that basically makes wrapping `dlopen` and `dlsym` optional because I found it was the underlying cause of a deadlock when I wrapped `MPI_Init` + some pthread lock functions and MPI dlopened + dlsymed some functions.
Also, even before that yesterday, I personally modified my version to always use the fix in #101 (but configurable at build time) because I encountered very occasional issues with OMPT (openmp-tools) failing in its search for `ompt_start_tool` (via `dlsym(RTLD_NEXT, ...)`). Switching to the fix in #101 even for glibc < v2.34 fixed it.
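To make the OMPT case concrete, this is roughly what a `dlsym(RTLD_NEXT, ...)` search for `ompt_start_tool` looks like. It is only a sketch: the `sketch_` names are mine, and the result type is left opaque instead of pulling in `<omp-tools.h>`. A lookup like this only finds the tool if `RTLD_NEXT` is resolved relative to the right caller, and that is exactly what a wrapped `dlsym` can disturb.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* glibc only declares RTLD_NEXT when this is set */
#endif
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

#ifndef RTLD_NEXT
/* Fallback if _GNU_SOURCE took effect too late; this is the value glibc uses. */
#define RTLD_NEXT ((void *) -1l)
#endif

/* Opaque stand-in for ompt_start_tool_result_t (assumption: we never look inside). */
struct sketch_tool_result;

/* Same parameter list as ompt_start_tool: OpenMP version, runtime version string. */
typedef struct sketch_tool_result *(*sketch_start_tool_fn)(unsigned int,
                                                           const char *);

/* Look up `name` in the objects that come after the caller in search order. */
void *sketch_find_next(const char *name)
{
    assert(name != NULL);
    return dlsym(RTLD_NEXT, name); /* NULL when no later object defines it */
}

/* Returns the tool entry point if one is visible, otherwise NULL. */
sketch_start_tool_fn sketch_find_start_tool(void)
{
    /* Object-to-function pointer cast; POSIX requires this to work for dlsym(). */
    return (sketch_start_tool_fn)sketch_find_next("ompt_start_tool");
}
```

If `sketch_find_start_tool()` returns NULL even though a tool library is loaded, the search started from the wrong place in the link order, which would match the occasional failures described above.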