
Build fails starting with glibc v2.34

Open rgmiller opened this issue 2 years ago • 13 comments

As of version 2.34 of glibc (which is the latest as of Oct 2021), GOTCHA fails to build. The problem is that GOTCHA uses _dl_sym and in v2.34, this is no longer exported.
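To check which side of the 2.34 boundary a given system is on, something like the following could be used. This is a minimal sketch, not part of GOTCHA: `glibc_at_least` and `glibc_may_export_dl_sym` are hypothetical helper names, and the only library call is glibc's `gnu_get_libc_version()`.

```c
#include <assert.h>
#include <gnu/libc-version.h>  /* gnu_get_libc_version(), glibc only */
#include <stdio.h>

/* Returns 1 if the glibc loaded at run time is at least major.minor,
 * 0 if it is older, and -1 if the version string cannot be parsed. */
int glibc_at_least(int major, int minor) {
    int have_major = 0, have_minor = 0;
    const char *version = gnu_get_libc_version();  /* e.g. "2.34" */
    assert(version != NULL);
    if (sscanf(version, "%d.%d", &have_major, &have_minor) != 2) {
        return -1;
    }
    return have_major > major ||
           (have_major == major && have_minor >= minor);
}

/* Hypothetical helper: per this issue, _dl_sym is no longer exported
 * starting with glibc 2.34, so only older versions may still provide it. */
int glibc_may_export_dl_sym(void) {
    return glibc_at_least(2, 34) == 0;
}
```

Note that this checks the glibc found at run time, whereas the failure reported here is a link error at build time.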

rgmiller avatar Oct 25 '21 20:10 rgmiller

This is going to take some effort to fix. Gotcha needs to wrap dlsym() to implement its function interception. But you need _dl_sym() to correctly implement a wrapper, since wrappers change the semantics of RTLD_NEXT.
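For readers unfamiliar with `RTLD_NEXT`: `dlsym(RTLD_NEXT, name)` searches the objects that come after the one containing the call to `dlsym`. The sketch below (the name `lookup_next` is hypothetical, not GOTCHA code) shows the plain case. If a wrapper library forwards a caller's `RTLD_NEXT` request this way, the search starts after the wrapper library rather than after the original caller, which is presumably why a correct wrapper needs an entry point like `_dl_sym` that lets it say who the caller is.

```c
#define _GNU_SOURCE            /* exposes RTLD_NEXT in <dlfcn.h> on glibc */
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

#ifndef RTLD_NEXT
/* Fallback in case a libc header was included before _GNU_SOURCE took
 * effect; this is the value glibc's <dlfcn.h> uses. */
#define RTLD_NEXT ((void *) -1L)
#endif

/* Looks up the next definition of `name` in the search order AFTER the
 * object that contains this function. The "current object" is decided by
 * where the dlsym call sits, not by who called lookup_next(). */
void *lookup_next(const char *name) {
    assert(name != NULL);
    return dlsym(RTLD_NEXT, name);
}
```

Called from the main executable, `lookup_next("puts")` would typically resolve to libc's `puts`. On glibc older than 2.34 this needs `-ldl` at link time.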

mplegendre avatar Oct 26 '21 20:10 mplegendre

@rgmiller @mplegendre Is this the failure you are seeing? I'm getting this with spack:

==> No patches needed for gotcha
==> gotcha: Executing phase: 'cmake'
==> gotcha: Executing phase: 'build'
==> Error: ProcessError: Command exited with status 2:
    'make' '-j12'

9 errors found in build log:
     109    make[2]: Entering directory '/opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-build-cqnzpr4'
     110    [ 89%] Building C object src/example/autotee/CMakeFiles/autotee_test.dir/test_autotee.c.o
     111    cd /opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-build-cqnzpr4/src/example/autotee && /home/jgalarowicz/spack-12-16-2021/spack/lib/spac
            k/env/gcc/gcc  -I/opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-src/include -I/opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-
            develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-src/src -O2 -g -DNDEBUG -MD -MT src/example/autotee/CMakeFiles/autotee_test.dir/test_autotee.c.o -MF CMakeFiles/autotee_test.dir/test_autotee.c.o.d
             -o CMakeFiles/autotee_test.dir/test_autotee.c.o -c /opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-src/src/example/autotee/test_autotee.
            c
     112    [ 94%] Linking C executable symb_look
     113    cd /opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-build-cqnzpr4/src/example/minimal && /home/jgalarowicz/spack-12-16-2021/spack/opt/spac
            k/linux-ubuntu21.10-skylake/gcc-8.5.0/cmake-3.21.4-vvjd47wit3w4na6zyu7cpowzkzumihrl/bin/cmake -E cmake_link_script CMakeFiles/symb_look.dir/link.txt --verbose=1
     114    /home/jgalarowicz/spack-12-16-2021/spack/lib/spack/env/gcc/gcc -O2 -g -DNDEBUG -rdynamic CMakeFiles/symb_look.dir/symbolLookup.c.o -o symb_look  -Wl,-rpath,/opt/tempspack/spack-stage/jgalarowicz
            /spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-build-cqnzpr4/src/example/minimal:/opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy
            2bfhh/spack-build-cqnzpr4/src libwrap_me.so ../../libgotcha.so.2.0.2
  >> 115    /usr/bin/ld: ../../libgotcha.so.2.0.2: undefined reference to `_dl_sym'
  >> 116    collect2: error: ld returned 1 exit status
  >> 117    make[2]: *** [src/example/minimal/CMakeFiles/symb_look.dir/build.make:102: src/example/minimal/symb_look] Error 1
     118    make[2]: Leaving directory '/opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-build-cqnzpr4'
  >> 119    make[1]: *** [CMakeFiles/Makefile2:253: src/example/minimal/CMakeFiles/symb_look.dir/all] Error 2
     120    make[1]: *** Waiting for unfinished jobs....
     121    [100%] Linking C executable autotee_test
     122    cd /opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-build-cqnzpr4/src/example/autotee && /home/jgalarowicz/spack-12-16-2021/spack/opt/spac
            k/linux-ubuntu21.10-skylake/gcc-8.5.0/cmake-3.21.4-vvjd47wit3w4na6zyu7cpowzkzumihrl/bin/cmake -E cmake_link_script CMakeFiles/autotee_test.dir/link.txt --verbose=1
     123    /home/jgalarowicz/spack-12-16-2021/spack/lib/spack/env/gcc/gcc -O2 -g -DNDEBUG -rdynamic CMakeFiles/autotee_test.dir/test_autotee.c.o -o autotee_test  -Wl,-rpath,/opt/tempspack/spack-stage/jgala
            rowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-build-cqnzpr4/src/example/autotee:/opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrz
            hp7boy2bfhh/spack-build-cqnzpr4/src libautotee.so ../../libgotcha.so.2.0.2
  >> 124    /usr/bin/ld: ../../libgotcha.so.2.0.2: undefined reference to `_dl_sym'
  >> 125    collect2: error: ld returned 1 exit status
  >> 126    make[2]: *** [src/example/autotee/CMakeFiles/autotee_test.dir/build.make:102: src/example/autotee/autotee_test] Error 1
     127    make[2]: Leaving directory '/opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-build-cqnzpr4'
  >> 128    make[1]: *** [CMakeFiles/Makefile2:200: src/example/autotee/CMakeFiles/autotee_test.dir/all] Error 2
     129    make[1]: Leaving directory '/opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-build-cqnzpr4'
  >> 130    make: *** [Makefile:139: all] Error 2

See build log for details:
  /opt/tempspack/spack-stage/jgalarowicz/spack-stage-gotcha-develop-cqnzpr4oo4bc5gxr54jrzhp7boy2bfhh/spack-build-out.txt

jgalarowicz avatar Dec 16 '21 23:12 jgalarowicz

Yes, that's the failure that goes with this bug.

mplegendre avatar Dec 17 '21 16:12 mplegendre

@mplegendre Are there any workarounds for this issue so we can continue to use gotcha? Thanks.

jgalarowicz avatar Dec 17 '21 16:12 jgalarowicz

we started seeing this while using gotcha via caliper

~~the following appears to work, but I can't say if it's even triggered by our workflow.~~

Edit: code removed.

PhilipDeegan avatar Dec 18 '21 13:12 PhilipDeegan

> we started seeing this while using gotcha via caliper
>
> the following appears to work, but I can't say if it's even triggered by our workflow

Unfortunately that's not going to work - the function is a wrapper for dlsym, so you'll just run into an infinite recursion if you call it from within. Simply using the original dlsym via orig_dlsym doesn't work either - I tried that and weird things happen. I don't know enough about how ld works to fix it.
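The recursion hazard can be simulated without touching the real `dlsym`. In this sketch every name is hypothetical: `bound_dlsym` stands in for "whatever the name `dlsym` resolves to", which is the wrapper itself once it has been installed. The guard only breaks the cycle and does not produce a correct lookup result, so it illustrates the problem and is not a fix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the function that the name dlsym resolves to. */
typedef void *(*lookup_fn)(void *handle, const char *name);

void *dlsym_wrapper_sketch(void *handle, const char *name);

/* Once the wrapper is installed, calls to dlsym land in the wrapper itself. */
static lookup_fn bound_dlsym = dlsym_wrapper_sketch;

static _Thread_local int depth;  /* per-thread nesting depth of the wrapper */
int reentries_detected;          /* how often the guard broke the cycle */

void *dlsym_wrapper_sketch(void *handle, const char *name) {
    assert(name != NULL);
    if (depth > 0) {
        /* Without this early return, the call below would recurse until
         * the stack overflows. */
        reentries_detected++;
        return NULL;
    }
    depth++;
    void *result = bound_dlsym(handle, name);  /* re-enters this function */
    depth--;
    return result;
}
```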

Meanwhile if this breaks building Caliper you can turn off Gotcha support with -DWITH_GOTCHA=Off. This primarily affects the MPI wrappers, which will fall back to PMPI and have slightly higher runtime overheads than the selective wrapping we can do with GOTCHA. You also lose the ability to wrap pthread_create and malloc/free, but they're not needed that often. If you just want to do region profiling you should be fine with turning Gotcha off in Caliper.

daboehme avatar Dec 23 '21 20:12 daboehme

@daboehme Since you have an internal build of GOTCHA in Caliper, you might just want to pull in the work-around in #101 in the event that it isn't accepted.

jrmadsen avatar Dec 26 '21 17:12 jrmadsen

@mplegendre Hi Matt - we are dead in the water with building our survey performance tool on newer systems. We don't use caliper so it sounds like any workarounds mentioned above don't help in our gotcha usage scenario. Will there be a general fix sometime soon or do we need to find an alternative to gotcha to use in our survey tool? Thanks for your time!

jgalarowicz avatar Jan 07 '22 15:01 jgalarowicz

@jgalarowicz I don't use caliper either. The fix in the PR #101 is not specific to it.

jrmadsen avatar Jan 07 '22 18:01 jrmadsen

@jrmadsen Thank you for letting me know. It appeared to be a caliper-specific fix when I looked at it. I will give it a try. Thanks again!

jgalarowicz avatar Jan 07 '22 18:01 jgalarowicz

I hit this bug today, and with caliper. Just came here to say "I have been got." :laughing:

vsoch avatar Mar 07 '22 01:03 vsoch

Any word on when/if #101 might be included in a release? We're pondering including it as a patch in the spack package since there are some gotcha dependants that we can't deploy on our target platforms because of this issue.

wspear avatar Aug 09 '22 17:08 wspear

For the record, just yesterday, I added another patch that basically makes wrapping dlopen and dlsym optional because I found it was the underlying cause of a deadlock when I wrapped MPI_Init + some pthread lock functions and MPI dlopened + dlsymed some functions.

Also, even before that, I personally modified my version to always use the fix in #101 (but configurable at build time) because I encountered very occasional issues with OMPT (openmp-tools) failing in its search for ompt_start_tool (via dlsym(RTLD_NEXT, ...)). Switching to the fix in #101 even for glibc < v2.34 fixed it.
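For context, the kind of lookup described here looks roughly like the sketch below. It is an illustration only: `find_next_ompt_start_tool` is a hypothetical name, and the result type is left opaque rather than pulled from `<omp-tools.h>`. The comment above does not say why the search failed; the sketch only shows that the result depends on which object issues the `dlsym` call.

```c
#define _GNU_SOURCE            /* exposes RTLD_NEXT in <dlfcn.h> on glibc */
#include <dlfcn.h>
#include <stddef.h>

#ifndef RTLD_NEXT
/* Fallback in case a libc header was included before _GNU_SOURCE took
 * effect; this is the value glibc's <dlfcn.h> uses. */
#define RTLD_NEXT ((void *) -1L)
#endif

/* Opaque stand-in for ompt_start_tool_result_t from <omp-tools.h>. */
struct ompt_start_tool_result;

/* Shape of ompt_start_tool as given in the OpenMP tools interface. */
typedef struct ompt_start_tool_result *(*ompt_start_tool_fn)(
    unsigned int omp_version, const char *runtime_version);

/* Returns the next ompt_start_tool after the object containing this
 * function, or NULL if no later object defines one. */
ompt_start_tool_fn find_next_ompt_start_tool(void) {
    void *sym = dlsym(RTLD_NEXT, "ompt_start_tool");
    return (ompt_start_tool_fn)sym;
}
```

In a process with no OMPT tool loaded, the function is expected to return NULL.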

jrmadsen avatar Aug 09 '22 18:08 jrmadsen