
Prevent processing of network event after SIGTERM executed

Open Sime-Zupanovic opened this issue 2 months ago • 2 comments

We found the root cause of the most recent dhcpcd crashes, which happened when dhcpcd --exit triggered SIGTERM handling. The problem is in eloop_start(): exitnow is checked too late when a SIGTERM has been processed, so the loop still enters eloop_run_ppoll() -> ppoll() after stop_all_interfaces() has already completed.

    int
    eloop_start(struct eloop *eloop, sigset_t *signals)
    {
    	int error;
    	struct eloop_timeout *t;
    	struct timespec ts, *tsp;

    	assert(eloop != NULL);

    #ifdef HAVE_KQUEUE
    	UNUSED(signals);
    #endif

    	for (;;) {
    		**if (eloop->exitnow)
    			break;**

    #ifndef HAVE_KQUEUE
    		if (_eloop_nsig != 0) {
    			int n = _eloop_sig[--_eloop_nsig];

    			if (eloop->signal_cb != NULL)
    				eloop->signal_cb(n, eloop->signal_cb_ctx);
    			continue;
    		}
    #endif
    		..
    		error = eloop_run_ppoll(eloop, tsp, signals);

As a consequence, ppoll() would sometimes report a network event and the corresponding callback (e.g. for REPLY6) would still be called after SIGTERM had already been handled. We suggest moving this check:

	**if (eloop->exitnow)
		break;**

just below if (_eloop_nsig != 0)

So once the SIGTERM callback dhcpcd_signal_cb() is done and we are back in eloop_start(), we should just exit and not fall through to the ppoll() call any more.

Sime-Zupanovic, Oct 03 '25 10:10

OK, this is not the root cause, this is just a workaround.

But basically we need to deal with this properly elsewhere as DHCPv6 release (SIGALRM vs SIGTERM/SIGINT) means that we do expect an acknowledgement packet.

rsmarples, Oct 06 '25 12:10

We will use your patch from https://github.com/NetworkConfiguration/dhcpcd/pull/536 with an additional small fix in dhcpcd_handlecarrier(), as explained in ticket 536.

Sime-Zupanovic, Oct 14 '25 08:10