Bear
Bear is stuck
I can't build GCC with Bear; it gets stuck at the end of compilation:
bear -- make -j16
...
libtool: link: ( cd ".libs" && rm -f "libgfortran.la" && ln -s "../libgfortran.la" "libgfortran.la" )
make[3]: Leaving directory '/dev/shm/objdir/x86_64-pc-linux-gnu/libgfortran'
make[2]: Leaving directory '/dev/shm/objdir/x86_64-pc-linux-gnu/libgfortran'
make[1]: Leaving directory '/dev/shm/objdir'
Here's the process tree:
I'm using the latest release:
$ bear --version
bear 3.0.18
Hey Martin, I need more context in order to help. Could you fill out the issue template? Also, run the command with verbose flag and attach the output? Thanks!
All right, so info from the issue template would be:
uname -a
Linux marxinbox.suse.cz 5.16.2-1-default #1 SMP PREEMPT Mon Jan 24 18:27:48 UTC 2022 (0d710a8) x86_64 x86_64 x86_64 GNU/Linux
Bear is from openSUSE TW distribution, normally installed package.
Using --verbose leads to a different error:
$ bear --verbose -- make
...
g++: fatal error: cannot execute 'cc1plus': execvp: No such file or directory
...
Interesting log!
I've seen the message g++: fatal error: cannot execute 'cc1plus': execvp: No such file or directory when the PATH environment variable is empty. And from the logs I see it is empty from the very first commands after Bear executes make.
Is it possible that the Makefile empties the environment?
Is this a GCC build? This was problematic on Kali Linux too. GCC executes cc1 or cc1plus, which is not in the PATH. (But it knows the location of it.) What I would expect is that the GCC driver program (gcc, cc, g++, c++, etc.) executes cc1plus with a full path. But what it does is just call execvp with only the name of the program, which is supposed to be searched for in the PATH.
I don't really know how to fix execution with an empty PATH for GCC. I know that it works without Bear. (/usr/bin/env - /usr/bin/gcc -c /dev/null works just fine.)
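To make the PATH dependence concrete, here is a minimal shell sketch of the kind of lookup a bare program name goes through. The function name lookup_in_path is made up for illustration. It only approximates a by-name search and is not code from GCC, libc, or Bear. In particular, it treats an empty PATH value as "nothing to search", which simplifies the real edge cases.

```shell
# lookup_in_path NAME PATH_VALUE
# Hypothetical helper: walk PATH_VALUE entry by entry and print the first
# executable regular file called NAME. Returns 1 when nothing is found.
lookup_in_path() {
    name=$1
    path_value=$2
    old_ifs=$IFS
    IFS=:
    set -f                      # do not glob-expand directory names
    for dir in $path_value; do
        [ -n "$dir" ] || dir=.  # an empty entry stands for the current directory
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            IFS=$old_ifs
            set +f
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    set +f
    return 1                    # also reached when PATH_VALUE is empty
}
```

Calling lookup_in_path cc1plus "$PATH" should fail on a typical install, since cc1plus lives in a GCC-private directory. If I remember the option correctly, gcc -print-prog-name=cc1plus prints the location the driver itself would use.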
So what's weird is that without the --verbose argument it works (until the end, where it's stuck).
Plus, I think the GCC driver uses execve with a full path, if I see correctly:
strace -f -s 512 g++ -fcf-protection -fno-PIE -c -DIN_GCC_FRONTEND -DIN_GCC_FRONTEND -g -DIN_GCC -fPIC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -I. -Ic-family -I/home/marxin/Programming/gcc/gcc -I/home/marxin/Programming/gcc/gcc/c-family -I/home/marxin/Programming/gcc/gcc/../include -I/home/marxin/Programming/gcc/gcc/../libcpp/include -I/home/marxin/Programming/gcc/gcc/../libcody -I/home/marxin/Programming/gcc/gcc/../libdecnumber -I/home/marxin/Programming/gcc/gcc/../libdecnumber/bid -I../libdecnumber -I/home/marxin/Programming/gcc/gcc/../libbacktrace -o c-family/c-common.o -MT c-family/c-common.o -MMD -MP -MF c-family/.deps/c-common.TPo /home/marxin/Programming/gcc/gcc/c-family/c-common.cc 2>&1 | grep execv
execve("/usr/bin/g++", ["g++", "-fcf-protection", "-fno-PIE", "-c", "-DIN_GCC_FRONTEND", "-DIN_GCC_FRONTEND", "-g", "-DIN_GCC", "-fPIC", "-fno-exceptions", "-fno-rtti", "-fasynchronous-unwind-tables", "-W", "-Wall", "-Wno-narrowing", "-Wwrite-strings", "-Wcast-qual", "-Wmissing-format-attribute", "-Woverloaded-virtual", "-pedantic", "-Wno-long-long", "-Wno-variadic-macros", "-Wno-overlength-strings", "-fno-common", "-DHAVE_CONFIG_H", "-I.", "-Ic-family", "-I/home/marxin/Programming/gcc/gcc", "-I/home/marxin/Programming/gcc/gcc/c-family", "-I/home/marxin/Programming/gcc/gcc/../include", "-I/home/marxin/Programming/gcc/gcc/../libcpp/include", "-I/home/marxin/Programming/gcc/gcc/../libcody", "-I/home/marxin/Programming/gcc/gcc/../libdecnumber", "-I/home/marxin/Programming/gcc/gcc/../libdecnumber/bid", "-I../libdecnumber", "-I/home/marxin/Programming/gcc/gcc/../libbacktrace", "-o", "c-family/c-common.o", "-MT", "c-family/c-common.o", "-MMD", "-MP", "-MF", "c-family/.deps/c-common.TPo", "/home/marxin/Programming/gcc/gcc/c-family/c-common.cc"], 0x7fffffffdbf8 /* 81 vars */) = 0
[pid 23829] execve("/usr/lib64/gcc/x86_64-suse-linux/11/cc1plus", ["/usr/lib64/gcc/x86_64-suse-linux/11/cc1plus", "-quiet", "-I", ".", "-I", "c-family", "-I", "/home/marxin/Programming/gcc/gcc", "-I", "/home/marxin/Programming/gcc/gcc/c-family", "-I", "/home/marxin/Programming/gcc/gcc/../include", "-I", "/home/marxin/Programming/gcc/gcc/../libcpp/include", "-I", "/home/marxin/Programming/gcc/gcc/../libcody", "-I", "/home/marxin/Programming/gcc/gcc/../libdecnumber", "-I", "/home/marxin/Programming/gcc/gcc/../libdecnumber/bid", "-I", "../libdecnumber", "-I", "/home/marxin/Programming/gcc/gcc/../libbacktrace", "-MMD", "c-family/c-common.d", "-MF", "c-family/.deps/c-common.TPo", "-MP", "-MT", "c-family/c-common.o", "-D_GNU_SOURCE", "-D", "IN_GCC_FRONTEND", "-D", "IN_GCC_FRONTEND", "-D", "IN_GCC", "-D", "HAVE_CONFIG_H", "/home/marxin/Programming/gcc/gcc/c-family/c-common.cc", "-quiet", "-dumpdir", "c-family/", "-dumpbase", "c-common.cc", "-dumpbase-ext", ".cc", "-mtune=generic", "-march=x86-64", "-g", "-Wextra", "-Wall", "-Wno-narrowing", "-Wwrite-strings", "-Wcast-qual", "-Wsuggest-attribute=format", "-Woverloaded-virtual", "-Wpedantic", "-Wno-long-long", "-Wno-variadic-macros", "-Wno-overlength-strings", "-fcf-protection=full", "-fPIC", "-fno-exceptions", "-fno-rtti", "-fasynchronous-unwind-tables", "-fno-common", "-o", "/tmp/ccUlqdms.s"], 0x501ec0 /* 85 vars */ <unfinished ...>
[pid 23829] <... execve resumed>) = 0
[pid 23829] write(3, "builtin_dgettext\"\n.LC1001:\n\t.string\t\"__builtin_dwarf_cfa\"\n.LC1002:\n\t.string\t\"__builtin_dwarf_sp_column\"\n.LC1003:\n\t.string\t\"__builtin_eh_return\"\n\t.align 8\n.LC1004:\n\t.string\t\"__builtin_eh_return_data_regno\"\n.LC1005:\n\t.string\t\"__builtin_execl\"\n.LC1006:\n\t.string\t\"__builtin_execlp\"\n.LC1007:\n\t.string\t\"__builtin_execle\"\n.LC1008:\n\t.string\t\"__builtin_execv\"\n.LC1009:\n\t.string\t\"__builtin_execvp\"\n.LC1010:\n\t.string\t\"__builtin_execve\"\n.LC1011:\n\t.string\t\"__builtin_exit\"\n.LC1012:\n\t.string\t\"__builtin_expect\"\n\t.align 8\n.LC10"..., 4096) = 4096
[pid 23830] execve("/usr/lib64/gcc/x86_64-suse-linux/11/../../../../x86_64-suse-linux/bin/as", ["/usr/lib64/gcc/x86_64-suse-linux/11/../../../../x86_64-suse-linux/bin/as", "-I", ".", "-I", "c-family", "-I", "/home/marxin/Programming/gcc/gcc", "-I", "/home/marxin/Programming/gcc/gcc/c-family", "-I", "/home/marxin/Programming/gcc/gcc/../include", "-I", "/home/marxin/Programming/gcc/gcc/../libcpp/include", "-I", "/home/marxin/Programming/gcc/gcc/../libcody", "-I", "/home/marxin/Programming/gcc/gcc/../libdecnumber", "-I", "/home/marxin/Programming/gcc/gcc/../libdecnumber/bid", "-I", "../libdecnumber", "-I", "/home/marxin/Programming/gcc/gcc/../libbacktrace", "--gdwarf-5", "--64", "-o", "c-family/c-common.o", "/tmp/ccUlqdms.s"], 0x501ec0 /* 85 vars */ <unfinished ...>
[pid 23830] <... execve resumed>) = 0
And yes, cc1plus is really not on PATH:
$ which cc1plus
which: no cc1plus in (/home/marxin/bin/valgrind/bin:/home/marxin/.local/bin:/home/marxin/bin:/usr/local/bin:/usr/bin:/bin:/home/marxin/Programming/gcc-util/boilerplate:/home/marxin/Programming/gcc-util/dumps:/home/marxin/Programming/script-misc)
Nice catch, Bear reports execvp for that execution. Maybe that's going to be the problem. Will look at it on the weekend.
I'm trying to reproduce it with the latest master, but the bug does not show up... The verbose log shows that it's using execvp and that cc1 is given with a full path.
This is what I'm running on Fedora or Arch:
$ /usr/bin/env - ../Bear.install/bin/bear -- /usr/bin/env - /usr/bin/gcc -c /tmp/empty.c
$ cat compile_commands.json
[
{
"arguments": [
"/usr/bin/gcc",
"-c",
"/tmp/empty.c"
],
"directory": "/tmp",
"file": "/tmp/empty.c"
}
]
Not sure this is related, but I've never been able to do a parallel build of gcc with bear (same symptom as @marxin: the build is stuck at the end). I usually do a sequential build when I'm not doing any dev. (IIRC, @philberty had the same issue.) Happy to help if needed.
@dkm there are two issues you mention:
- Bear is stuck at the end. This might just be citnames running slowly on the event file. Bear executes two binaries. First intercept, which collects the executed process names and writes them into an event file. (This event file is 4 GB for a Linux kernel compilation.) Then it executes citnames, which reads the event file, filters out the compiler calls, detects duplicates, and formats the compilation database entries. (This process is single-threaded and takes 1-2 min for the Linux kernel.) To check whether this is the case for you, run intercept and citnames separately (as bear would do).
- Can't run parallel builds. intercept runs a gRPC service to collect the executions, and gRPC has an open bug about not closing file descriptors fast enough. (This applies if your build fails with a "not enough file descriptor" message.) As a workaround, increase the max file descriptor limit.
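For the file descriptor workaround, a minimal sketch, assuming a shell whose ulimit builtin accepts -n together with -S and -H (bash and dash do). The function name raise_nofile_limit is made up for illustration. It raises the soft limit only as far as the existing hard limit, so it needs no extra privileges.

```shell
# raise_nofile_limit (hypothetical helper)
# Lift the soft limit on open file descriptors to the hard limit. The change
# applies to the current shell and to every process started from it.
raise_nofile_limit() {
    hard=$(ulimit -Hn)
    ulimit -Sn "$hard"
}

raise_nofile_limit
printf 'open files: soft=%s hard=%s\n' "$(ulimit -Sn)" "$(ulimit -Hn)"
```

Running bear -- make -j16 from the same shell afterwards would then inherit the raised limit. Whether that is enough depends on the hard limit configured on the machine.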
I've seen intercept be slow in the past because it was not able to write the entries fast enough. (The root cause was that intercept was using SQLite to store events, but that was removed in recent versions.) If you are running the compilation on a remote drive, or on a slow drive, that can make intercept look like it is stuck.
Things to try out:
- Run intercept and citnames instead of calling bear. You will see at which phase it gets stuck.
- Run intercept with --verbose, which will show how it goes. Then we can see whether the build is still going on or is stuck somewhere in intercept.
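The two-phase run suggested above could be scripted roughly like this. It is a sketch under assumptions: the option names (intercept --output, citnames --input and --output) are what I recall from the Bear 3.0.x man pages and should be checked against --help on your install. The helper names time_phase and bear_in_two_phases are made up for illustration.

```shell
# time_phase LABEL COMMAND...
# Run one phase, then report its exit status and wall-clock time on stderr.
time_phase() {
    label=$1
    shift
    start=$(date +%s)
    "$@"
    rc=$?
    end=$(date +%s)
    printf '%s: exit=%s elapsed=%ss\n' "$label" "$rc" "$((end - start))" >&2
    return "$rc"
}

# bear_in_two_phases BUILD_COMMAND...
# Assumed flags, not verified against every Bear release:
#   intercept --output FILE -- COMMAND...
#   citnames --input FILE --output FILE
bear_in_two_phases() {
    time_phase intercept \
        intercept --output compile_commands.events.json -- "$@" || return
    time_phase citnames \
        citnames --input compile_commands.events.json --output compile_commands.json
}
```

Something like bear_in_two_phases make -j8 all would then show which of the two phases never prints its summary line.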
I have a similar problem; my problem arises whenever I use more than 6 jobs for make.
I get the same issue at various gcc build steps when used with crosstool-ng, even with only one job.
Maybe there is an issue with command redirection, since killing a cut - process terminates everything (obviously with a failure).
During the build, even the main intercept process uses quite a lot of CPU. It creates a giant >1GB file (logging all commands, including mv for example), which later gets reduced to 3MB.
Bear 2.4.4 works, and is a lot faster.
FWIW, I've tried again this morning using a freshly built bear: same behavior as described above (also tested with the latest Debian package, 3.0.20-1+b3). But I'm not sure it really picks everything from my local install, as I can see:
└─ bear -- make -j8 all
└─ intercept --library /usr/$LIB/bear/libexec.so --wrapper /usr/lib/x86_64-linux-gnu/bear/wrap
├─ intercept --library /usr/$LIB/bear/libexec.so --wrapper /usr/lib/x86_64-linux-gnu/bear/w
├─ intercept --library /usr/$LIB/bear/libexec.so --wrapper /usr/lib/x86_64-linux-gnu/bear/w
The bear command is the correct one. Removing the Debian package seems to correct this, so maybe there's something to be fixed in the search routine?
The process never seems to finish. I don't see any CPU activity and still have plenty of free RAM, so I'm not sure what's wrong:
1[|| 1.3%] Tasks: 166, 1077 thr, 232 kthr; 1 running
2[| 0.6%] Load average: 1.24 4.76 4.86
3[| 0.6%] Uptime: 17:45:36
4[||| 2.6%]
5[| 0.6%]
6[|| 1.3%]
7[||| 1.9%]
8[||| 2.5%]
9[||| 1.9%]
10[||| 1.9%]
11[| 0.6%]
12[| 0.6%]
Mem[|||||||||||||||||||||||||||||||||||| 5.12G/47.0G]
Swp[ 0K/48.8G]
PID△USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
3212785 dkm 20 0 6284 3944 3600 S 0.0 0.0 0:00.00 │ │ └─ bear -- make -j8 all
3212786 dkm 20 0 3771M 43836 15588 S 0.0 0.1 1:11.61 │ │ └─ intercept --library /home/dkm/local/stow/bear/lib/bear/libexec.so --wrapper /home/dkm/local/st
3212787 dkm 20 0 3771M 43836 15588 S 0.0 0.1 0:00.03 │ │ ├─ intercept --library /home/dkm/local/stow/bear/lib/bear/libexec.so --wrapper /home/dkm/local
3212788 dkm 20 0 3771M 43836 15588 S 0.0 0.1 0:00.00 │ │ ├─ intercept --library /home/dkm/local/stow/bear/lib/bear/libexec.so --wrapper /home/dkm/local
3212789 dkm 20 0 3771M 43836 15588 S 0.0 0.1 0:01.00 │ │ ├─ intercept --library /home/dkm/local/stow/bear/lib/bear/libexec.so --wrapper /home/dkm/local
3212790 dkm 20 0 3771M 43836 15588 S 0.0 0.1 0:00.00 │ │ ├─ intercept --library /home/dkm/local/stow/bear/lib/bear/libexec.so --wrapper /home/dkm/local
3212791 dkm 20 0 3771M 43836 15588 S 0.0 0.1 0:00.00 │ │ ├─ intercept --library /home/dkm/local/stow/bear/lib/bear/libexec.so --wrapper /home/dkm/local
3212792 dkm 20 0 3771M 43836 15588 S 0.0 0.1 0:00.00 │ │ ├─ intercept --library /home/dkm/local/stow/bear/lib/bear/libexec.so --wrapper /home/dkm/local
3212793 dkm 20 0 3771M 43836 15588 S 0.0 0.1 0:00.00 │ │ ├─ intercept --library /home/dkm/local/stow/bear/lib/bear/libexec.so --wrapper /home/dkm/local
3212794 dkm 20 0 3771M 43836 15588 S 0.0 0.1 0:00.00 │ │ ├─ intercept --library /home/dkm/local/stow/bear/lib/bear/libexec.so --wrapper /home/dkm/local
3212795 dkm 20 0 3771M 43836 15588 S 0.0 0.1 0:00.00 │ │ ├─ intercept --library /home/dkm/local/stow/bear/lib/bear/libexec.so --wrapper /home/dkm/local
....
+1, I'm facing the same issue with gccrs. If I use bear -- make -j8, the compilation succeeds, but the process never exits and, like @dkm said, no CPU or RAM is consumed either. It just hangs indefinitely. I also noticed that the compile_commands.events.json was around 253M for me. Upon CTRL-C-ing the process, the compile_commands.json contained 14 entries.
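A quick way to get the same two numbers (event file size and number of database entries) is sketched below. The helper name count_entries is made up for illustration. It assumes the pretty-printed layout shown earlier in this thread, where each entry has exactly one "file": key on its own line. It is not a real JSON parser.

```shell
# count_entries FILE (hypothetical helper)
# Count compilation database entries by counting lines with a "file": key.
# Assumes one key per line, as in the compile_commands.json pasted above.
count_entries() {
    grep -c '"file":' "$1"
}
```

Typical use would be ls -lh compile_commands.events.json followed by count_entries compile_commands.json in the build directory.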
Thanks guys for this report. Is there a way that you could identify the command that hung the build? And create a minimal example which could help me to reproduce this error? That would be a great help to fix this bug.
Sure, I'll see what I can find, thanks @rizsotto
Is there a way to see what is being executed? I have a stuck bear but don't really know how to dig into it :)
Maybe try bear -vvvv -- make?
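Besides the verbose log, one can also look at what is still alive under a stuck process. This is a Linux-only sketch that reads /proc directly. The helper name children_of is made up for illustration and is not part of Bear.

```shell
# children_of PID (hypothetical helper, Linux only)
# Print "PID STATE COMMAND" for every direct child of PID, using /proc.
children_of() {
    parent=$1
    for status_file in /proc/[0-9]*/status; do
        # Processes may exit while we scan; errors are silenced and skipped.
        ppid=$(awk '/^PPid:/ { print $2 }' "$status_file" 2>/dev/null)
        [ "$ppid" = "$parent" ] || continue
        pid=${status_file#/proc/}
        pid=${pid%/status}
        state=$(awk '/^State:/ { print $2 }' "$status_file" 2>/dev/null)
        # cmdline is NUL-separated; turn the separators into spaces.
        cmdline=$(cat "/proc/$pid/cmdline" 2>/dev/null | tr '\0' ' ')
        printf '%s %s %s\n' "$pid" "$state" "$cmdline"
    done
}
```

Calling it with the PID of the stuck bear, and then with the PIDs it prints, walks down the tree. A state of S means sleeping, Z means a zombie that has not been reaped, and D means uninterruptible sleep.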
I've got a stuck bear, with the last lines on the term being:
[16:48:00.206738, cs, 2849623] [pid: 2849622] recognition failed: No tools recognize this execution.
[16:48:00.207043, cs, 2849623] compilation entries created. [size: 0]
[16:48:00.207048, cs, 2849623] compilation entries to output. [size: 0]
[16:48:00.207176, cs, 2849623] compilation entries written. [size: 0]
[16:48:00.207186, cs, 2849623] succeeded with: 0
[16:48:00.208142, br, 2694891] Process wait request: done. [pid: 2849623]
[16:48:00.208207, br, 2694891] Running citnames finished. [Exited with 0]
[16:48:00.224768, br, 2694891] succeeded with: 0
The log is rather big with all my env, so I'm not very comfortable putting it here. I can give it to someone for debugging :)
The last command seems to be:
[16:48:00.206412, cs, 2849623] [pid: 2849622] execution: {"executable":"/bin/bash","arguments":["/bin/bash","-c","test -f config.h || make \"AR_FLAGS=rc\" \"CC_FOR_BUILD=gcc\" \"CFLAGS=-g -O2 -m32\" \"CXXFLAGS=-g -O2 -D_GNU_SOURCE -m32\" \"CFLAGS_FOR_BUILD=-g -O2\" \"CFLAGS_FOR_TARGET=-g -O2\" \"INSTALL=/usr/bin/install -c\" \"INSTALL_DATA=/usr/bin/install -c -m 644\" \"INSTALL_PROGRAM=/usr/bin/install -c\" \"INSTALL_SCRIPT=/usr/bin/install -c\" \"JC1FLAGS=\" \"LDFLAGS=-m32\" \"LIBCFLAGS=-g -O2 -m32\" \"LIBCFLAGS_FOR_TARGET=-g -O2\" \"MAKE=make\" \"MAKEINFO=makeinfo --split-size=5000000 \" \"PICFLAG=\"
...
Thanks @dkm for this update.
I will test whether that command alone can cause the build to get stuck.
But what I find strange is that the output you've pasted here reports that the whole build process finished. The intercept and citnames processes finished. And even bear, which was calling these processes, finished. (The succeeded with... lines are literally the last lines in the main function of these tools.)
Oh, maybe some thread was started with an incorrect parameter and is preventing the process from finishing until it is joined?