Raspberry Pi 4: DHCPCD route socket overflowed

Open jwillmer opened this issue 3 years ago • 24 comments

Describe the bug Every now and then my Pi loses its IPv6 connectivity. I found out that I can fix the issue temporarily via systemctl restart dhcpcd. Today it happened again and I used systemctl status dhcpcd to look at the state. I got the following output:

Warning: The unit file, source configuration file or drop-ins of dhcpcd.service changed on disk. Run 'systemctl daemon-reload' to reload units.
● dhcpcd.service - dhcpcd on all interfaces
   Loaded: loaded (/lib/systemd/system/dhcpcd.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/dhcpcd.service.d
           └─wait.conf
   Active: failed (Result: signal) since Fri 2021-01-22 15:36:35 CET; 1 day 6h ago
  Process: 340 ExecStart=/usr/lib/dhcpcd5/dhcpcd -q -w (code=exited, status=0/SUCCESS)
 Main PID: 484 (code=killed, signal=SEGV)

Jan 22 15:36:35 home-server dhcpcd[484]: veth2c6abe3: waiting for carrier
Jan 22 15:36:35 home-server dhcpcd[484]: vethf713b46: IAID 69:de:ae:f1
Jan 22 15:36:35 home-server dhcpcd[484]: vethf713b46: adding address fe80::e7........7:52f9
Jan 22 15:36:35 home-server dhcpcd[484]: veth88ef6b4: waiting for carrier
Jan 22 15:36:35 home-server dhcpcd[484]: veth514e931: waiting for carrier
Jan 22 15:36:35 home-server dhcpcd[484]: route socket overflowed - learning interface state
Jan 22 15:36:35 home-server dhcpcd[484]: vethf32a0ec: carrier acquired
Jan 22 15:36:35 home-server dhcpcd[484]: vethf32a0ec: IAID bf:44:26:9c
Jan 22 15:36:35 home-server systemd[1]: dhcpcd.service: Main process exited, code=killed, status=11/SEGV
Jan 22 15:36:35 home-server systemd[1]: dhcpcd.service: Failed with result 'signal'.

I don't have enough knowledge about Linux to say that this is the right channel for this issue. Please be kind and redirect me if this issue is completely off topic.

To reproduce I don't know. I can't find a pattern, it just happens now and then.

System

  • Which model of Raspberry Pi? Raspberry Pi 4

  • Which OS and version (cat /etc/rpi-issue)?

Raspberry Pi reference 2020-08-20
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, 9a3a10bf1019ebb2d59053564dc6b90068bad27d, stage2
  • Which firmware version (vcgencmd version)?
Jan  7 2021 18:27:29
Copyright (c) 2012 Broadcom
version fb345a0c2d5544957f4ba1a2b9e968970e3312c4 (clean) (release) (start)
  • Which kernel version (uname -a)? Linux home-server 5.10.6-v7l+ #1393 SMP Mon Jan 11 15:09:41 GMT 2021 armv7l GNU/Linux

Additional context I am only running Docker containers on the Pi. I boot the OS from an SSD, but I only started doing that recently and had the issue before as well.

jwillmer avatar Jan 23 '21 20:01 jwillmer

I have a similar problem. It seems that the dhcpcd version shipped with Raspberry Pi OS, which is:

dhcpcd 8.1.2
Copyright (c) 2006-2019 Roy Marples
Compiled in features: INET ARP ARPing IPv4LL INET6 DHCPv6 AUTH

fails if there are too many network interfaces. It seems like you also have Docker running on the Pi.

On my side, the dhcpcd and dhcpcd5 daemons will not come up after a fresh reboot. I have about 10 Docker containers running in several docker-compose projects. If I shut down the compose projects before rebooting the Pi (it's a Pi 4), everything works fine and the daemon comes up.

I read in another forum that this is a known bug and should be fixed in a newer dhcpcd version. So dhcpcd fails if there are too many network interfaces.

Perhaps somebody has a workaround or even a fix for that.

alaub81 avatar Jan 24 '21 17:01 alaub81

@alaub81 do you have a link to that issue so that I can track the progress?

jwillmer avatar Jan 24 '21 20:01 jwillmer

@jwillmer I just read it here: forums.gentoo.org

alaub81 avatar Jan 25 '21 05:01 alaub81

Fastest workaround: run sudo nano /etc/dhcpcd.conf and insert the following line at the end: denyinterfaces veth*

This excludes the virtual container interfaces from dhcpcd.
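For anyone who prefers not to edit the file by hand, here is a minimal sketch of the same change as a script. It is only an illustration: add_denyinterfaces is a hypothetical helper name, and it places the line at the top of the file because later comments in this thread report that the position mattered for them. It needs to run as root to modify /etc/dhcpcd.conf.

```shell
#!/bin/sh
# Hypothetical helper (not part of dhcpcd): put "denyinterfaces veth*" on the
# first line of a dhcpcd config file unless an identical line already exists.
add_denyinterfaces() {
    conf="${1:-/etc/dhcpcd.conf}"
    line='denyinterfaces veth*'

    # -x matches whole lines only, -F treats the pattern as a fixed string.
    if grep -qxF "$line" "$conf"; then
        return 0
    fi

    tmp="$(mktemp)" || return 1
    # Build the new content in a temp file, then write it back in place so the
    # original file keeps its owner and permissions.
    { printf '%s\n' "$line"; cat "$conf"; } > "$tmp" && cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

After calling add_denyinterfaces as root, restart dhcpcd (or reboot) so the change takes effect.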

FP2K-Minske avatar Feb 06 '21 17:02 FP2K-Minske

@FP2K-Minske thank you, just tried it right now and it seems to work :-)

alaub81 avatar Feb 06 '21 18:02 alaub81

I had the same exact issue with Docker and denying veth interfaces solved the issue for me! Much appreciated ;)

theunreal89 avatar May 07 '21 10:05 theunreal89

Here to confirm that FP2K's workaround also worked for me. I was tracking this problem and I can say this solution also works.

In addition, I observed that this problem occurs every time the DHCP lease expires (4h by default). The dhcpcd service then crashes and the Raspberry Pi goes offline but stays powered on. I have the exact same scenario: multiple Docker interfaces. Fortunately, the two workarounds mentioned above fix this.

EDIT: After @cpannwitz's comment below, I have to clarify. I've tested both solutions I mentioned above individually. I didn't apply both simultaneously.

daniel-asilva avatar May 09 '21 19:05 daniel-asilva

In addition to the fixes by @FP2K-Minske and @daniel-asilva (both fixes applied), I had to reload the systemd daemon and restart dhcpcd:

sudo systemctl daemon-reload
sudo systemctl restart dhcpcd

afterwards, because there were complaints about changed conf files on disk, which resulted in the same problem, dhcpcd not working after reboot.

EDIT: In my case, applying BOTH fixes (see above) did NOT work. I had to remove the fix posted by @daniel-asilva, and had to move denyinterfaces veth* to the top of the /etc/dhcpcd.conf file.

cpannwitz avatar May 17 '21 06:05 cpannwitz

This has been happening to me since I set a static IP for wlan0 from the GUI.

I see the same thing happening in this thread: https://raspberrypi.stackexchange.com/questions/58809/rpi-loses-its-wlan0-configuration-when-any-docker-container-is-started/117381#117381

It was solved by disabling DHCP for virtual interfaces with the denyinterfaces veth* trick in /etc/dhcpcd.conf. Make sure you add it to the top of the file and reboot. Otherwise it won't work.

But I suspect that when you set only the IP, it looks for the rest of the configuration on all networks, including the new veth ones.

I will try to confirm this by setting all the static config requested in the GUI and seeing what happens.

This is so crazy. I was getting TLS and socket reset errors in my stack, and I was thinking for a month it was my stack issue.

moracabanas avatar Mar 19 '22 01:03 moracabanas

and I was thinking for a month it was my stack issue.

Just for a month? :P I've been having network crashes (same symptoms, docker swarm cluster) for well over a year, and going back even to raspberry pi 3 kernels and I could never pinpoint this in any way. System logs were very ambiguous and since I run them headless I always assumed they crashed until I realised I could still access them over another address/IP protocol (I run IPv6 and 2x attached VLANs on each raspi so technically 4x addresses per each Pi - 2x IPv4, 1x IPv6 public and 1xIPv6 ULA)

If this actually works I'll be ecstatic :D Thanks for posting this workaround folks!

d-rez avatar Mar 26 '22 14:03 d-rez

This has been happening to me since I set a static IP for wlan0 from the GUI.

I see the same thing happening in this thread: https://raspberrypi.stackexchange.com/questions/58809/rpi-loses-its-wlan0-configuration-when-any-docker-container-is-started/117381#117381

It was solved by disabling DHCP for virtual interfaces with the denyinterfaces veth* trick in /etc/dhcpcd.conf. Make sure you add it to the top of the file and reboot. Otherwise it won't work.

There was another Q posted to RPi recently that involved strange issues with docker services. I don't use docker services, and would have ignored the question except that the title of the Q implied network issues. I eventually gave a rather elaborate and tutorial answer that was primarily to make this point: Do not use dhcpcd's static ip option.

This shouldn't be controversial (or so I thought) as the author of dhcpcd says in man dhcpcd.conf:

For IPv4, you should use the inform ipaddress option instead of setting a static address.

The OP didn't provide any feedback; I don't know if he resolved his issue or not. But I ran across this thread, and wanted to ask a question, hopefully to get some feedback.

In the first line of this quote, it seems that you are using the static ip option, and so my question is this: Instead of static ip, have you tried either the request or inform options? If so, did that have any effect on the docker issues?
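For anyone wanting to try this, here is a minimal sketch of what an inform stanza in /etc/dhcpcd.conf might look like. The interface name eth0 and the address 192.168.1.10/24 in the usage note are placeholders, not values from this thread, and print_inform_stanza is a hypothetical helper that only prints the text.

```shell
#!/bin/sh
# Hypothetical helper: print a dhcpcd.conf stanza that uses the "inform"
# option for one interface. Interface and address are supplied by the caller.
print_inform_stanza() {
    iface="$1"
    addr="$2"
    # "interface" limits the options that follow to that interface;
    # "inform" takes an address, optionally with a /cidr suffix.
    printf 'interface %s\ninform %s\n' "$iface" "$addr"
}
```

Calling print_inform_stanza eth0 192.168.1.10/24 prints two lines that would take the place of a static ip_address= entry for the same interface. Whether this changes the Docker behaviour is exactly the open question here.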

seamusdemora avatar Apr 04 '22 18:04 seamusdemora

Encountered the same problem, and after a week of searching I finally found the answer here. It seems that this issue still hasn't been fixed.

I have two Raspberry Pi 4s with Raspberry Pi OS (64-bit) installed, both running Docker, and both lose their Ethernet connection after 2~3 days of uptime. I have to manually reboot them every time to recover...

pi@rasp-2:log $ uname -a
Linux rasp-2 5.15.32-v8+ #1538 SMP PREEMPT Thu Mar 31 19:40:39 BST 2022 aarch64 GNU/Linux
Jul 23 15:16:28 rasp-2 dhcpcd[721]: veth4bd92c9: soliciting an IPv6 router
Jul 23 15:16:28 rasp-2 dhcpcd[721]: veth73f0901: waiting for carrier
Jul 23 15:16:28 rasp-2 dhcpcd[721]: vethb3ac390: soliciting a DHCP lease
Jul 23 15:16:28 rasp-2 dhcpcd[721]: veth91f9038: soliciting a DHCP lease
Jul 23 15:16:28 rasp-2 dhcpcd[721]: veth576f779: waiting for carrier
Jul 23 15:16:28 rasp-2 dhcpcd[721]: route socket overflowed - learning interface state
Jul 23 15:16:28 rasp-2 dhcpcd[721]: veth7777561: carrier acquired

ykun91 avatar Jul 25 '22 12:07 ykun91

I'd like to note that this is still an active issue, and the fix mentioned in https://github.com/raspberrypi/linux/issues/4092/#issuecomment-774512217 does resolve it. I wish it hadn't taken me weeks to find this thread, but very happy I did. I'm curious if this also persists on alternate distros like DietPi

eric-pierce avatar Aug 23 '22 23:08 eric-pierce

Same here - thanks @FP2K-Minske - doesn't even feel very hacky - simply telling dhcpcd not to do something that it probably sensibly tries to do by default.

bfren avatar Aug 24 '22 06:08 bfren

Been experiencing a similar problem too. Headless Raspberry Pi 4 with 10 Docker containers (and corresponding veth* interfaces). Was regularly losing connectivity on eth0 after a couple of weeks of uptime. The only suspicious thing I could find in the logs is the mentioned "DHCPCD route socket overflowed" message.
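A quick way to check whether a given Pi has hit this is to count the overflow messages in the dhcpcd journal. This is a minimal sketch, assuming the unit is named dhcpcd as in the status output earlier in this thread; count_overflows is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read log lines on stdin and print how many of them
# contain the overflow message quoted in this thread.
count_overflows() {
    # grep -c exits non-zero when the count is 0, so mask that exit status.
    grep -cF 'route socket overflowed' || true
}

# Example usage on the Pi itself (not executed here):
#   journalctl -u dhcpcd --no-pager | count_overflows
#   ip -o link show type veth | wc -l    # number of veth interfaces present
```

A non-zero count together with a double-digit number of veth interfaces would match the pattern described in the comments above.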

I will try the "denyinterfaces" option for dhcpcd 🤞.

denwald avatar Dec 02 '22 17:12 denwald

Maybe updating dhcpcd to a version >= 9.2.0 could also help. There are a few interesting notes in the changelog of that version that seem related...

The latest dhcpcd version available on Raspberry Pi OS is 8.1.2.
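To see where a particular system stands, the installed version can be compared against 9.2.0. This is a minimal sketch, assuming the first line of dhcpcd --version looks like the "dhcpcd 8.1.2" output quoted earlier in this thread and that sort supports -V (GNU coreutils does); both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed when version $1 is greater than or equal to
# version $2, using the version ordering of "sort -V".
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: print the second field of the first line of
# "dhcpcd --version", e.g. "8.1.2" for the output "dhcpcd 8.1.2".
installed_dhcpcd_version() {
    dhcpcd --version 2>/dev/null | awk 'NR == 1 { print $2 }'
}

# Example usage (not executed here):
#   if version_ge "$(installed_dhcpcd_version)" 9.2.0; then
#       echo "dhcpcd is at least 9.2.0"
#   else
#       echo "dhcpcd is older than 9.2.0"
#   fi
```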

ferrarimarco avatar Feb 16 '23 20:02 ferrarimarco

Just wanted to register another vote to bring dhcpcd up to a version more recent than 2019, as there have been a lot of improvements since then.

dulitz avatar May 27 '23 15:05 dulitz

I ended up using raspi-config to switch to NetworkManager on all my Pis instead of dhcpcd - not had a problem since.

bfren avatar May 28 '23 04:05 bfren

This bug is especially strange since I don't use IPv6 anywhere, but it seems like 6+ independent containers create the conditions for overflow.

doodlebro avatar Jun 15 '23 14:06 doodlebro

The issue is still persistent. I realized that after enabling IPv6 in my network and running about 10 Docker containers. Initially I thought there was a DHCP issue on my router (too short a DHCP lease time), but it's like in this thread: at some point dhcpcd gives up and stops renewing DHCP leases, regardless of how the DHCP server is configured. It took me several hours to debug the matter with dhcpcd and find this thread. It's a very confusing kind of error!

Meanwhile, I'm using what @bfren proposed above: NetworkManager instead of dhcpcd.

Either dhcpcd should be updated or NetworkManager should be the default in the OS. Otherwise a user is going to be faced with strange networking issues that are hard to troubleshoot whenever they want to do some more serious work with the Pi and networking :-(

areksobiczewski avatar Mar 08 '24 16:03 areksobiczewski

@areksobiczewski , et al

Please note that issues with dhcpcd are likely impacted by the fact that the RPi powers-zat-bee decided some time ago to stick with an old, no-longer-maintained-upstream version of dhcpcd. That left all bug-fixes for dhcpcd as the responsibility of someone in the RPi organization - or maybe a volunteer?? At any rate - in my experience, no one seemed to give a rat's-a$$ if it was maintained or not.

That's not meant as criticism, but only as a plain statement of fact.

seamusdemora avatar Mar 08 '24 22:03 seamusdemora

Indeed. The decision has been made, for whatever reasons (however good - presumably there are consequences to using later versions of dhcpcd?). What I don't understand is why NetworkManager isn't simply made the default - are there consequences to using it that I'm not aware of?

If not, given doing that would easily fix the strange and definitely hard to troubleshoot issues caused by having relatively few Docker containers, I don't see why it hasn't been done.

bfren avatar Mar 09 '24 09:03 bfren

It seems that the release notes for the latest Raspberry Pi OS version (based on Bookworm), contain this line:

  * NetworkManager used instead of dhcpcd as networking interface; various changes made to networking plugin to support this

ferrarimarco avatar Mar 09 '24 10:03 ferrarimarco

@ferrarimarco that is curious. When I used the Bookworm image to install a new Pi 5, the default was still dhcpcd and I had to change it using raspi-config.
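To check which of the two is actually managing the network on a given install, something like the following might help. This is a minimal sketch, assuming the systemd units are named NetworkManager and dhcpcd; active_network_service is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the first of the two services that systemd
# reports as active, or "none" if neither is (or systemctl is unavailable).
active_network_service() {
    for unit in NetworkManager dhcpcd; do
        if systemctl is-active --quiet "$unit" 2>/dev/null; then
            echo "$unit"
            return 0
        fi
    done
    echo none
}

# Example usage (not executed here):
#   active_network_service
```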

bfren avatar Mar 09 '24 16:03 bfren