fix "Can't cd to (unreachable)" error

Open odenbach opened this issue 6 years ago • 7 comments

Hi,

one of our Debian stretch machines made needrestart crash with the following error:

root@bilbo[~]# needrestart -v
[main] eval /etc/needrestart/needrestart.conf
[main] needrestart v3.3
[main] running in root mode
[Core] Using UI 'NeedRestart::UI::stdio'...
[main] systemd detected
[Core] #598 is a NeedRestart::Interp::Python
[Python] #598: source=/usr/local/sbin/bcfg2-push
[Core] #11905 is a NeedRestart::Interp::Perl
[Perl] #11905: source=/usr/local/bin/apache-vsl
Can't cd to (unreachable)/etc/apache2: No such file or directory

After some digging I found that just before the call to 'scan_deps' the working dir is "(unreachable)/something". A simple chdir ($cwd) just before the call fixed this.

The issues #55 and #72 look quite similar, maybe this patch would have fixed them as well.

Thanks,

Christopher

odenbach avatar Aug 28 '18 15:08 odenbach

Hi,

the crash happens within Module::ScanDeps and is triggered if /proc/PID/root is unreachable. Your fix breaks the scan_deps function for perl modules loaded from relative paths.

Maybe scan_deps should be wrapped in an eval, since there seems to be no fully reliable way to prevent crashes.
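
A minimal sketch of that idea, assuming nothing about how needrestart actually calls scan_deps: `fragile_scan` below is a hypothetical stand-in that dies the same way, and `safe_scan` is an illustrative wrapper name. It only shows the general eval pattern of skipping one process instead of aborting the whole run.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a call such as scan_deps() that may die.
sub fragile_scan {
    my ($dir) = @_;
    die "Can't cd to $dir: No such file or directory\n"
        if $dir =~ /^\(unreachable\)/;
    return ("$dir/example.pm");
}

# Run the scan inside eval so that a die() is caught here.
# Returns the scanned files, or an empty list if the scan died.
sub safe_scan {
    my ($dir) = @_;
    my @files;
    my $ok = eval { @files = fragile_scan($dir); 1 };
    unless ($ok) {
        warn "scan failed, skipping: $@";
        return ();
    }
    return @files;
}

my @good = safe_scan('/etc/apache2');
my @bad  = safe_scan('(unreachable)/etc/apache2');
print scalar(@good), " file(s) from the reachable dir, ",
      scalar(@bad), " from the unreachable one\n";
```

The `eval { ...; 1 }` form is used so that success is detected by the block's return value rather than by inspecting `$@` alone.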

liske avatar Aug 29 '18 17:08 liske

We too have a machine that occasionally just prints Can't cd to (unreachable)/root: No such file or directory

Is there any additional information we could provide to help fix this?

mphilipps avatar Nov 05 '18 10:11 mphilipps

I'm currently missing a way to trigger the problem for debugging, fixing, and testing purposes. Which process triggered the bug, @mphilipps?

liske avatar Nov 05 '18 21:11 liske

I am not sure; I can't trigger it 100% reliably. Usually I am able to trigger it by calling needrestart a couple of times in a row. I am fairly certain that it doesn't depend on anything actually requiring a restart, but it might depend on processes running at the same time.


 needrestart -v
[main] eval /etc/needrestart/needrestart.conf
[main] needrestart v3.3.1
[main] running in root mode
[Core] Using UI 'NeedRestart::UI::stdio'...
[main] systemd detected
[Core] #1905 is a NeedRestart::Interp::Perl
[Perl] #1905: source=/usr/sbin/munin-node
[Core] #1914 is a NeedRestart::Interp::Python
[Python] #1914: source=/usr/bin/fail2ban-server
Use of uninitialized value $_[0] in exec at /usr/share/perl5/NeedRestart/Utils.pm line 199.
Can't exec "": No such file or directory at /usr/share/perl5/NeedRestart/Utils.pm line 199.
[Python] #1914: failed to retrieve include path
[Core] #2626 is a NeedRestart::Interp::Perl
[Perl] #2626: could not get current working directory, skipping
[Core] #2629 is a NeedRestart::Interp::Perl
[Perl] #2629: could not get current working directory, skipping
[Core] #2631 is a NeedRestart::Interp::Perl
[Perl] #2631: could not get current working directory, skipping
[Core] #3350 is a NeedRestart::Interp::Perl
[Perl] #3350: could not get current working directory, skipping
[Core] #3352 is a NeedRestart::Interp::Perl
[Perl] #3352: could not get current working directory, skipping
[Core] #3353 is a NeedRestart::Interp::Perl
[Perl] #3353: could not get current working directory, skipping
[Core] #4089 is a NeedRestart::Interp::Perl
[Perl] #4089: could not get current working directory, skipping
[Core] #4090 is a NeedRestart::Interp::Perl
[Perl] #4090: could not get current working directory, skipping
[Core] #4091 is a NeedRestart::Interp::Perl
[Perl] #4091: could not get current working directory, skipping
[Core] #5229 is a NeedRestart::Interp::Perl
[Perl] #5229: could not get current working directory, skipping
[Core] #5233 is a NeedRestart::Interp::Perl
[Perl] #5233: could not get current working directory, skipping
[Core] #5237 is a NeedRestart::Interp::Perl
[Perl] #5237: could not get current working directory, skipping
[Core] #5339 is a NeedRestart::Interp::Python
[Python] #5339: source=/usr/bin/fail2ban-server
[Python] #5339: use cached file list
[Core] #6380 is a NeedRestart::Interp::Perl
[Perl] #6380: could not get a source file, skipping
[Core] #6558 is a NeedRestart::Interp::Perl
[Perl] #6558: could not get a source file, skipping
[Core] #6569 is a NeedRestart::Interp::Perl
[Perl] #6569: could not get a source file, skipping
[Core] #6571 is a NeedRestart::Interp::Perl
[Perl] #6571: could not get a source file, skipping
[Core] #6573 is a NeedRestart::Interp::Perl
[Perl] #6573: could not get a source file, skipping
[Core] #6576 is a NeedRestart::Interp::Perl
[Perl] #6576: could not get a source file, skipping
[Core] #6584 is a NeedRestart::Interp::Perl
[Perl] #6584: could not get a source file, skipping
[Core] #6592 is a NeedRestart::Interp::Perl
[Perl] #6592: could not get a source file, skipping
[Core] #7608 is a NeedRestart::Interp::Perl
[Perl] #7608: could not get a source file, skipping
[Core] #8408 is a NeedRestart::Interp::Perl
[Perl] #8408: could not get current working directory, skipping
[Core] #8409 is a NeedRestart::Interp::Perl
[Perl] #8409: could not get current working directory, skipping
[Core] #8411 is a NeedRestart::Interp::Perl
[Perl] #8411: could not get current working directory, skipping
[Core] #10593 is a NeedRestart::Interp::Perl
[Perl] #10593: source=/usr/bin/ts
Can't cd to (unreachable)/root: No such file or directory

Is this of any help to you?

mphilipps avatar Nov 06 '18 10:11 mphilipps

This is triggered by systemd sandboxing and glibc < 2.28.

munin-node from stretch-backports has the following in munin-node.service:

PrivateDevices=false
PrivateTmp=true
ProtectHome=true
# "full" (instead of "strict") still allows write access to the state files
ProtectSystem=full

Any one of those protections changes the mount namespace of the process, and this will result in

$ perl -MCwd -e 'chdir("/proc/3610/root/usr") or die; print getcwd(), "\n";'
(unreachable)/usr

Glibc 2.28 includes the change below, which returns the "correct" cwd when the kernel getcwd() returns a path starting with '(unreachable)' (or, more precisely, any path not starting with a slash).

https://github.com/bminor/glibc/commit/52a713fdd0a30e1bd79818e2e3c4ab44ddca1a94#diff-db4c20ba9c7120b79d0fbfbb8b754787

getcwd("(unreachable)/usr", 4095)       = 18
lstat(".", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
openat(AT_FDCWD, "..", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
fcntl(3, F_GETFL)                       = 0x8000 (flags O_RDONLY|O_LARGEFILE)
fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
getdents64(3, /* 22 entries */, 32768)  = 552
newfstatat(3, "usr", {st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0
close(3)                                = 0
write(1, "/usr\n", 5)                   = 5
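
The condition described above (a kernel-returned path that does not start with a slash is treated as unreachable) can be sketched in Perl as follows. This is only an illustration of the check, not glibc's code; `cwd_is_reachable` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Returns 1 if the string looks like an absolute path (starts with "/"),
# and 0 otherwise, e.g. for "(unreachable)/usr" or an undefined value.
sub cwd_is_reachable {
    my ($path) = @_;
    return (defined($path) && $path =~ m{^/}) ? 1 : 0;
}

for my $p ('/usr', '(unreachable)/usr') {
    printf "%-20s reachable=%d\n", $p, cwd_is_reachable($p);
}
```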

I believe the following patch would have the same effect as the glibc change, but it will of course not work if the mount namespace truly has a different root device etc. mounted.

--- /usr/share/perl5/NeedRestart/Interp/Perl.pm.orig    2018-12-18 07:41:38.733005997 +0200
+++ /usr/share/perl5/NeedRestart/Interp/Perl.pm 2018-12-18 07:42:45.822534003 +0200
@@ -57,7 +57,7 @@
        return ();
     }
     my $cwd = getcwd();
-    chdir("/proc/$pid/root/$ptable->{cwd}");
+    chdir(readlink("/proc/$pid/root") . "/$ptable->{cwd}");
 
     # get original ARGV
     (my $bin, local @ARGV) = nr_parse_cmd($pid);
@@ -104,7 +104,7 @@
        return ();
     }
     my $cwd = getcwd();
-    chdir("/proc/$pid/root/$ptable->{cwd}");
+    chdir(readlink("/proc/$pid/root") . "/$ptable->{cwd}");
 
     # skip the process if the cwd is unreachable (i.e. due to mnt ns)
     unless(getcwd()) {
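
One caveat with the patched line is that readlink() returns undef when the link cannot be read. A minimal sketch of the same path construction with that case handled, assuming a hypothetical helper `resolve_target_dir` that is not part of needrestart; the demo uses a temporary symlink instead of /proc/$pid/root so it runs without special privileges:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: join the target of a root symlink
# (e.g. /proc/$pid/root) with the process cwd.
# Returns nothing (undef in scalar context) if the link cannot be read.
sub resolve_target_dir {
    my ($root_link, $cwd) = @_;
    my $root = readlink($root_link);
    return unless defined $root;
    $root =~ s{/+$}{};    # "/" becomes "", so no doubled slash appears
    $cwd  =~ s{^/+}{};
    return "$root/$cwd";
}

# Demo with a temporary symlink pointing at "/".
my $tmp = tempdir(CLEANUP => 1);
symlink('/', "$tmp/root") or die "symlink failed: $!";
my $dir = resolve_target_dir("$tmp/root", '/usr');
print defined $dir ? "$dir\n" : "root link not readable\n";
```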

asalmela avatar Dec 18 '18 06:12 asalmela

The patch above by @asalmela fixes this issue for me. I have debian stretch+backports and munin-node version 2.0.43-3~bpo9+1.

Before patch:

[Core] #5281 is a NeedRestart::Interp::Perl
[Perl] #5281: source=/usr/sbin/munin-node
Can't cd to (unreachable)/: No such file or directory

After patch:

[Core] #5281 is a NeedRestart::Interp::Perl
[Perl] #5281: source=/usr/sbin/munin-node
[Core] #9519 is a NeedRestart::Interp::Python
[Python] #9519: source=/usr/share/system-config-printer/applet.py

cjpeterein avatar Dec 26 '18 21:12 cjpeterein

Thanks @asalmela for digging into it. I still wonder if this should be patched in needrestart since it is a bug within 3rd party code.

For Debian, the patch could be added to the bpo upload as a workaround. On Debian buster and later it should not be necessary, since those releases ship with glibc 2.28+.

liske avatar Jan 28 '19 21:01 liske