passenger
passenger copied to clipboard
Crash in Passenger::removeDirTree on logrotate
Issue report
hi Today we had a crash of PassengerAgent during the daily logrotate for apache2
/var/log/apache2/error.log
{
rotate 60
daily
maxsize 268435456
missingok
notifempty
compress
sharedscripts
postrotate
/bin/systemctl reload apache2 > /dev/null
endscript
}
Backtrace:
Core was generated by `Passenger core '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00005582e89fd0fa in Passenger::Agent::Fundamentals::dumpBacktrace (state=..., userData=0x0) at src/agent/Shared/Fundamentals/AbortHandler.cpp:595
595 src/agent/Shared/Fundamentals/AbortHandler.cpp: No such file or directory.
[Current thread is 1 (Thread 0x7f7d17fff700 (LWP 3368))]
(gdb) bt
#0 0x00005582e89fd0fa in Passenger::Agent::Fundamentals::dumpBacktrace (state=..., userData=0x0) at src/agent/Shared/Fundamentals/AbortHandler.cpp:595
#1 0x00005582e89fc09d in Passenger::Agent::Fundamentals::runInSubprocessWithTimeLimit (state=..., callback=0x5582e89fcd8b <Passenger::Agent::Fundamentals::dumpBacktrace(Passenger::Agent::Fundamentals::AbortHandlerWorkingState&, void*)>, userData=0x0, timeLimit=4000)
at src/agent/Shared/Fundamentals/AbortHandler.cpp:294
#2 0x00005582e89fdb54 in Passenger::Agent::Fundamentals::dumpDiagnostics (state=...) at src/agent/Shared/Fundamentals/AbortHandler.cpp:746
#3 0x00005582e89ff073 in Passenger::Agent::Fundamentals::abortHandler (signo=6, info=0x7f7d157f5fb0, _unused=0x7f7d157f5e80) at src/agent/Shared/Fundamentals/AbortHandler.cpp:1085
#4 <signal handler called>
#5 0x00007f7d20b59fff in raise () from /lib/x86_64-linux-gnu/libc.so.6
#6 0x00007f7d20b5b42a in abort () from /lib/x86_64-linux-gnu/libc.so.6
#7 0x00007f7d214720ad in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#8 0x00007f7d21470066 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#9 0x00007f7d2146f089 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#10 0x00007f7d2146f9dd in __gxx_personality_v0 () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#11 0x00007f7d20ed5f33 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#12 0x00007f7d20ed6437 in _Unwind_Resume () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#13 0x00005582e89a690f in Passenger::removeDirTree (path="/tmp/passenger.spawn.XXXXS2B3C0") at src/cxx_supportlib/FileTools/FileManip.cpp:311
#14 0x00005582e8aba791 in Passenger::SpawningKit::HandshakeWorkDir::~HandshakeWorkDir (this=0x7f7cdc001860, __in_chrg=<optimized out>) at src/agent/Core/SpawningKit/Handshake/WorkDir.h:81
#15 0x00005582e8b2f413 in boost::checked_delete<Passenger::SpawningKit::HandshakeWorkDir> (x=0x7f7cdc001860) at src/cxx_supportlib/vendor-modified/boost/core/checked_delete.hpp:34
#16 0x00005582e8b0ed6f in boost::scoped_ptr<Passenger::SpawningKit::HandshakeWorkDir>::~scoped_ptr (this=0x7f7d157f8560, __in_chrg=<optimized out>) at src/cxx_supportlib/vendor-modified/boost/smart_ptr/scoped_ptr.hpp:88
#17 0x00005582e8abae39 in Passenger::SpawningKit::HandshakeSession::~HandshakeSession (this=0x7f7d157f8550, __in_chrg=<optimized out>) at src/agent/Core/SpawningKit/Handshake/Session.h:91
#18 0x00005582e8ad6544 in Passenger::SpawningKit::SmartSpawner::spawn (this=0x7f7d0c007ca0, options=...) at src/agent/Core/SpawningKit/SmartSpawner.h:1299
#19 0x00005582e8c043bb in Passenger::ApplicationPool2::Group::spawnThreadRealMain (this=0x7f7d0c007240, spawner=..., options=..., restartsInitiated=0) at src/agent/Core/ApplicationPool/Group/SpawningAndRestarting.cpp:107
#20 0x00005582e8c03d71 in Passenger::ApplicationPool2::Group::spawnThreadMain (this=0x7f7d0c007240, self=..., spawner=..., options=..., restartsInitiated=0) at src/agent/Core/ApplicationPool/Group/SpawningAndRestarting.cpp:53
#21 0x00005582e8c51986 in boost::_mfi::mf4<void, Passenger::ApplicationPool2::Group, boost::shared_ptr<Passenger::ApplicationPool2::Group>, boost::shared_ptr<Passenger::SpawningKit::Spawner>, Passenger::ApplicationPool2::Options, unsigned int>::operator() (this=0x7f7d0c031fa0, p=0x7f7d0c007240, a1=..., a2=...,
a3=..., a4=0) at src/cxx_supportlib/vendor-modified/boost/bind/mem_fn_template.hpp:506
#22 0x00005582e8c4de90 in boost::_bi::list5<boost::_bi::value<Passenger::ApplicationPool2::Group*>, boost::_bi::value<boost::shared_ptr<Passenger::ApplicationPool2::Group> >, boost::_bi::value<boost::shared_ptr<Passenger::SpawningKit::Spawner> >, boost::_bi::value<Passenger::ApplicationPool2::Options>, boost::_bi::value<unsigned int> >::operator()<boost::_mfi::mf4<void, Passenger::ApplicationPool2::Group, boost::shared_ptr<Passenger::ApplicationPool2::Group>, boost::shared_ptr<Passenger::SpawningKit::Spawner>, Passenger::ApplicationPool2::Options, unsigned int>, boost::_bi::list0> (this=0x7f7d0c031fb0, f=..., a=...)
at src/cxx_supportlib/vendor-modified/boost/bind/bind.hpp:531
#23 0x00005582e8c4990d in boost::_bi::bind_t<void, boost::_mfi::mf4<void, Passenger::ApplicationPool2::Group, boost::shared_ptr<Passenger::ApplicationPool2::Group>, boost::shared_ptr<Passenger::SpawningKit::Spawner>, Passenger::ApplicationPool2::Options, unsigned int>, boost::_bi::list5<boost::_bi::value<Passenger::ApplicationPool2::Group*>, boost::_bi::value<boost::shared_ptr<Passenger::ApplicationPool2::Group> >, boost::_bi::value<boost::shared_ptr<Passenger::SpawningKit::Spawner> >, boost::_bi::value<Passenger::ApplicationPool2::Options>, boost::_bi::value<unsigned int> > >::operator() (this=0x7f7d0c031fa0)
at src/cxx_supportlib/vendor-modified/boost/bind/bind.hpp:1294
#24 0x00005582e8c44693 in boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void, boost::_mfi::mf4<void, Passenger::ApplicationPool2::Group, boost::shared_ptr<Passenger::ApplicationPool2::Group>, boost::shared_ptr<Passenger::SpawningKit::Spawner>, Passenger::ApplicationPool2::Options, unsigned
int>, boost::_bi::list5<boost::_bi::value<Passenger::ApplicationPool2::Group*>, boost::_bi::value<boost::shared_ptr<Passenger::ApplicationPool2::Group> >, boost::_bi::value<boost::shared_ptr<Passenger::SpawningKit::Spawner> >, boost::_bi::value<Passenger::ApplicationPool2::Options>, boost::_bi::value<unsigned int>
> >, void>::invoke (function_obj_ptr=...) at src/cxx_supportlib/vendor-modified/boost/function/function_template.hpp:158
#25 0x00005582e89b11c0 in boost::function0<void>::operator() (this=0x7f7d0c0015a0) at src/cxx_supportlib/vendor-modified/boost/function/function_template.hpp:764
#26 0x00005582e8aa57fb in oxt::dynamic_thread_group::thread_main (this=0x7f7d0c0072b0, data=0x7f7d0c0015a0) at src/cxx_supportlib/oxt/dynamic_thread_group.hpp:131
#27 0x00005582e8b9cbe8 in boost::_mfi::mf1<void, oxt::dynamic_thread_group, oxt::dynamic_thread_group::thread_func_data*>::operator() (this=0x7f7cdc0008e0, p=0x7f7d0c0072b0, a1=0x7f7d0c0015a0) at src/cxx_supportlib/vendor-modified/boost/bind/mem_fn_template.hpp:165
#28 0x00005582e8b8dde3 in boost::_bi::list2<boost::_bi::value<oxt::dynamic_thread_group*>, boost::_bi::value<oxt::dynamic_thread_group::thread_func_data*> >::operator()<boost::_mfi::mf1<void, oxt::dynamic_thread_group, oxt::dynamic_thread_group::thread_func_data*>, boost::_bi::list0> (this=0x7f7cdc0008f0, f=...,
a=...) at src/cxx_supportlib/vendor-modified/boost/bind/bind.hpp:319
#29 0x00005582e8b7cbdb in boost::_bi::bind_t<void, boost::_mfi::mf1<void, oxt::dynamic_thread_group, oxt::dynamic_thread_group::thread_func_data*>, boost::_bi::list2<boost::_bi::value<oxt::dynamic_thread_group*>, boost::_bi::value<oxt::dynamic_thread_group::thread_func_data*> > >::operator() (this=0x7f7cdc0008e0)
at src/cxx_supportlib/vendor-modified/boost/bind/bind.hpp:1294
#30 0x00005582e8b64e42 in boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void, boost::_mfi::mf1<void, oxt::dynamic_thread_group, oxt::dynamic_thread_group::thread_func_data*>, boost::_bi::list2<boost::_bi::value<oxt::dynamic_thread_group*>, boost::_bi::value<oxt::dynamic_thread_group::thread_func_data*> > >, void>::invoke (function_obj_ptr=...) at src/cxx_supportlib/vendor-modified/boost/function/function_template.hpp:158
#31 0x00005582e89b11c0 in boost::function0<void>::operator() (this=0x7f7d157f9e40) at src/cxx_supportlib/vendor-modified/boost/function/function_template.hpp:764
#32 0x00005582e8cd2330 in oxt::thread::thread_main(boost::function<void ()>, boost::shared_ptr<oxt::thread_local_context>) (func=..., ctx=...) at src/cxx_supportlib/oxt/implementation.cpp:416
#33 0x00005582e89921c1 in boost::_bi::list2<boost::_bi::value<boost::function<void ()> >, boost::_bi::value<boost::shared_ptr<oxt::thread_local_context> > >::operator()<void (*)(boost::function<void ()>, boost::shared_ptr<oxt::thread_local_context>), boost::_bi::list0>(boost::_bi::type<void>, void (*&)(boost::function<void ()>, boost::shared_ptr<oxt::thread_local_context>), boost::_bi::list0&, int) (a=<synthetic pointer>..., f=<optimized out>, this=<optimized out>) at src/cxx_supportlib/vendor-modified/boost/bind/bind.hpp:319
#34 boost::_bi::bind_t<void, void (*)(boost::function<void ()>, boost::shared_ptr<oxt::thread_local_context>), boost::_bi::list2<boost::_bi::value<boost::function<void ()> >, boost::_bi::value<boost::shared_ptr<oxt::thread_local_context> > > >::operator()() (this=<optimized out>)
at src/cxx_supportlib/vendor-modified/boost/bind/bind.hpp:1294
#35 boost::detail::thread_data<boost::_bi::bind_t<void, void (*)(boost::function<void ()>, boost::shared_ptr<oxt::thread_local_context>), boost::_bi::list2<boost::_bi::value<boost::function<void ()> >, boost::_bi::value<boost::shared_ptr<oxt::thread_local_context> > > > >::run() (this=<optimized out>)
at src/cxx_supportlib/vendor-modified/boost/thread/detail/thread.hpp:117
#36 0x00005582e8cc67da in boost::(anonymous namespace)::thread_proxy (param=0x7f7d0c031b70) at src/cxx_supportlib/vendor-modified/boost/libs/thread/src/pthread/thread.cpp:177
#37 0x00007f7d224ba4a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#38 0x00007f7d20c0fd0f in clone () from /lib/x86_64-linux-gnu/libc.so.6
Question 2: Passenger version and integration mode:
- passenger 1:6.0.4-1~stretch1 and libapache2-mod-passenger 1:6.0.4-1~stretch1
Question 3: OS or Linux distro, platform (including version):
- Debian 9.11
Question 4: Passenger installation method: [x] Phusion APT repo
Question 5: Your app's programming language (including any version managers) and framework (including versions):
- ruby 2.3 running redmine.
Question 6: Are you using a PaaS and/or containerization? If so which one?
- no
I think the logrotate might only be relevant in that it causes apache to relaod the config and therefore restarts passenger.
it looks like passenger is trying to clean up its spawning directory, but it's already gone. Is your /tmp dir regularly cleaned out?
We have no tmpreaper or such running on /tmp, but the debian default apache2.service is using PrivateTmp=true, could that have something to do with it?
Yes, that could be related. Does a directory matching the pattern: /tmp/passenger.spawn.******** get created at all when you start apache? If not, does such a directory get created in apache's private tmp dir?
No, even after restarting apache /tmp appears to be empty.
nsenter -t 23186 -m ls /tmp/
23186 is the pid of the passenger core process.
However we do set PassengerInstanceRegistryDir to /run/passenger and /run/passenger contains passenger.Bd2nqTe. Is this the correct directory?
No that's a different dir, the dir you're looking for has the .spawn.
bit in the middle.
$ systemctl restart apache2
$ nsenter -t 12338 -m find / -name '*.spawn.*'
and nothing was found.
Can you check if systemd is starting apache with a private tmp dir?
Is there any doubt?
$ ls -l /proc/{1,8735}/ns/mnt
lrwxrwxrwx 1 root root 0 Feb 10 09:11 /proc/1/ns/mnt -> mnt:[4026531840]
lrwxrwxrwx 1 root root 0 Feb 10 09:11 /proc/8735/ns/mnt -> mnt:[4026532229]
I would take that as yes it is starting apache with a private tmp dir.
I'm also interested in this. We've had apache/passenger running in production with our current setup for years. Just today we had a decent outage on production because I think of this same error.
dmesg
shows a segmentation fault at exactly the time we do our logrotate (unfortunately I couldn't find the core file, even though looking in /etc/apache2
where it said it might be found.
Interestingly, it was only the following day when logrotate happened again that the real trouble started and we ended up getting a bunch of these errors:
[ E 2020-03-04 06:25:21.0225 32376/T0 apa/Hooks.cpp:751 ]: Unexpected error in mod_passenger: Cannot connect to the Passenger core at unix:/var/passenger/passenger.0Oh8G1l/agents.s/core
Backtrace:
in 'Passenger::FileDescriptor Passenger::Apache2Module::Hooks::connectToCore()' (Hooks.cpp:343)
in 'int Passenger::Apache2Module::Hooks::handleRequest(request_rec*)' (Hooks.cpp:622)
Anyway, maybe this is unrelated, but, maybe it's of help. We've had two of these segfaults, the first one on Feb 20th.
I've also been unable to find the spawn
directories, we also set our own PASSENGER_INSTANCE_REGISTRY_DIR
.
would a possible fix/workaround (specifically for the logrotate issue) be instead of issuing a reload
to issue a restart
?
default logrotate for apache
postrotate
if invoke-rc.d apache2 status > /dev/null 2>&1; then \
invoke-rc.d apache2 reload > /dev/null 2>&1; \
fi;
change to
postrotate
if invoke-rc.d apache2 status > /dev/null 2>&1; then \
invoke-rc.d apache2 restart > /dev/null 2>&1; \
fi;
Afaik even on an apache2 reload the passenger process gets restarted. So it unlikely to help.
Ok can someone test the fix here: https://github.com/phusion/passenger/tree/Fix_privatetmp_crash It seems to work for me, but i've had difficulty reproducing the original crash reliably so it might just be luck.
Anybody?
Sorry @CamJN, this happens quite irregularly for us (and only on prod) and I don't want to test the fix in production.
While this is absolutely not a fix for the underlying issue, I changed our /etc/logrotate.d/apache2
to use the copytruncate
strategy instead of reloading apache itself (which was what triggered this issue). There is apparently some downsides to this, it's possible that you'll lose a few log entries, but for us it's far more important that we don't have this issue meaning we're serving 500s for a period of time.
You can read more about the copytruncate strategy here
[ E 2020-03-04 06:25:21.0225 32376/T0 apa/Hooks.cpp:751 ]: Unexpected error in mod_passenger: Cannot connect to the Passenger core at unix:/var/passenger/passenger.0Oh8G1l/agents.s/core Backtrace: in 'Passenger::FileDescriptor Passenger::Apache2Module::Hooks::connectToCore()' (Hooks.cpp:343) in 'int Passenger::Apache2Module::Hooks::handleRequest(request_rec*)' (Hooks.cpp:622)
Very much the same problem here
@claasz are you using 6.0.8 or later? Are you using systemd's PRIVATE_TMP
?
@claasz are you using 6.0.8 or later? Are you using systemd's
PRIVATE_TMP
?
@CamJN We are using 6.0.7. I have no idea what systemd's PRIVATE_TMP
is. How would I check it?
@claasz This might help you determine if you are using systemd private tmp: https://access.redhat.com/discussions/3027351
@claasz This might help you determine if you are using systemd private tmp: https://access.redhat.com/discussions/3027351
@CamJN I see. So yes, we are using systemd private tmp.
Ok, well I added a possible fix in 6.0.8, if you try that or 6.0.9 let me know if it works for you.
@CamJN Thanks. However, we've already applied @patrickdavey 's copytruncate
workaround. It's a production system which showed the problem quite infrequently, so it's not possible for us to test the fix.
I have the same problem for over a very long time and are not able to solve it. Anyone with a solution for this?
@patrickdavey Do you have an example of your copytruncate workaround? Or do you already have another fix in the meantime?
This is the contents of our /etc/logrotate.d/apache2
file
/var/log/apache2/*.log {
daily
missingok
rotate 20
compress
delaycompress
notifempty
create 640 root adm
copytruncate
sharedscripts
}
Yours may be slightly different, but, it's the copytruncate
line you need @royjoling. And no, we didn't find a different fix, once the copytruncate worked that was enough (and it was impossible to reproduce anyway, and I'm afraid we don't have the resources to play around with this stuff in production).
I finally found a ticket about this issue that has been plaguing me for half a year 😭. This happens exactly 24 hours after the server was booted up, and it didn't always happen.
Been trying a few things, including changing the PassengerInstanceRegistryDir
which didn't fix the issue. Also updated to 6.0.14 and it still happens.
I'll be trying the copytruncate
fix next.
Update: the fix doesn't work.