[Pi 4B 8GB] When EEE (Energy Efficient Ethernet) is enabled and active on eth0 at gigabit speed, ethernet becomes slow to unresponsive with packets dropping
Summary When Energy Efficient Ethernet is enabled and active on a fresh install of Raspberry Pi OS on a Pi 4B 8GB, gigabit ethernet transfer with the built-in ethernet adapter becomes slow and unresponsive, with packets being dropped. Other computers supporting EEE plugged into the same network behave normally, even when they are tested on the same port that the Pi was connected to. When the Pi is connected with EEE enabled, it also causes odd behaviour on other devices connected to the same switch, such as the other device dropping packets and/or reconnecting constantly. Removing the Pi from the network returns the switch and the other devices to normal.
To reproduce When a Pi 4B 8GB is connected to a gigabit switch with default EEE behaviour, the network performance of the Pi deteriorates as described above.
Status from ethtool --show-eee eth0:
EEE Settings for eth0:
EEE status: enabled - active
Tx LPI: inactive
Supported EEE link modes: 100baseT/Full
1000baseT/Full
Advertised EEE link modes: 100baseT/Full
1000baseT/Full
Link partner advertised EEE link modes: 100baseT/Full
1000baseT/Full
Temporary fix As soon as EEE is disabled with 'ethtool --set-eee eth0 eee off', ethernet performance returns to normal at gigabit speeds.
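As a rough illustration of the workaround above, the following sketch only disables EEE when ethtool actually reports it as active. The function name eee_is_active is made up for this example; the only ethtool invocations assumed are the two already shown in this report.

```shell
#!/bin/sh
# Hypothetical helper: reads `ethtool --show-eee <iface>` output on stdin and
# succeeds (exit status 0) only if the "EEE status" line says "enabled - active".
eee_is_active() {
    grep -q 'EEE status:[[:space:]]*enabled - active'
}
```

It could be used as `ethtool --show-eee eth0 | eee_is_active && sudo ethtool --set-eee eth0 eee off`, which leaves the link alone if EEE is already off.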
Logs I am unsure which logs would be necessary to help diagnose the problem... please let me know and I will be happy to add them!
Equipment in use: Home network with BT Business Hub in use as modem and switch + Powerline ethernet with built in switch (BT Mini Connector)
A relevant forum thread: https://www.raspberrypi.org/forums/viewtopic.php?t=305820
I've read here: https://www.raspberrypi.org/forums/viewtopic.php?t=305820 that it cannot simply be disabled via the ethtool command, because it gets re-enabled later for some reason. One user fixed this with a script running in a constant loop that disables it again once a second, which I'd like to avoid.
Another user there asked: "Does anyone know if dtparam=eee=off still works if in config.txt?"
Could somebody here tell me whether that still works with the Pi 4 (I'd test it myself, but I don't have an EEE-enabled switch)? That would be a nice workaround without messy scripts etc.
I have put the entry into the config.txt and will update when a reboot comes to see if it sets the EEE off. :)
Can confirm dtparam=eee=off does not work on a Pi CM4, which makes sense since it's a Pi 3B+ specific thing according to the README (likely flips some stuff in the ethernet+USB chip the 3B+ uses)
Do we know what keeps turning EEE back on?
I think #3292 is facing the same issue.
I have the same issue. I had to replace my old modem, which had a 100 Mb interface, with a new one that has a 1 Gb link. Since then I have been seeing a constant 10%-20% packet loss. Disabling EEE solved it for me as well. It seems very much like an issue with the Pi 4. Any solution? EEE is quite an old and robust technology; it can't be that we have to disable it nowadays.
EEE is a hardware feature, and other than disabling it we have no control over how it works. It can be easily disabled with dtparam=eee=off.
It seems that the dtparam=eee=off hack only works on the Pi 3B+. Disabling EEE via ethtool --set-eee eth0 eee off is the proper way on the Pi 4 and CM4. This issue is widely reported (1,2,3,4), and has been confirmed with multiple reputable routers/switches with EEE enabled (in #3292). Should we disable EEE for Pi 4 by default?
Should we disable EEE for Pi 4 by default?
No, because for the vast majority of users it saves power and causes no problems.
Do you have proof of that claim that it causes no problems? It seems like every rpi4 is affected and most just don't care to notice/do not notice. It becomes especially noticeable once you connect to a 1gb ethernet interface.
I have not seen any comment yet where someone says it works for him. If someone cares about power consumption and can handle constant connection losses, he can enable this feature. But as it is now, this is causing more trouble than what energy can be saved.
Do you have proof of that claim that it causes no problems?
Only my own experience, and that of other RPi engineers. Do you think we are using 100Mbps Ethernet, either in the office or at home?
I have not seen any comment yet where someone says it works for him.
People don't bother to open or reply to issues for things that work for them. Nice as that would be, this area is for problem reports. Given the age of this thread and the number of contributors, it isn't as widespread a problem as you seem to think.
If someone cares about power consumption and can handle constant connection losses, he can enable this feature.
The main consideration behind enabling EEE by default is the enormous number of Pis that are not and will never be connected by Ethernet. If the feature was opt-in, it would almost never be enabled.
This answer was about the claim that it works for everyone else. It seems like it does not work at all on any Pi 4, and we do not need to wait for someone to reply that it works. Just do a simple test: pick a random Pi 4, connect it to a 1 Gb/s interface and see for yourself.
By ignoring this hardware/design flaw, you are basically knowingly letting everyone suffer. In this state the Pi 4 is useless for anything more than homebrew experiments, unless you are the poor guy who spends months debugging and eventually finds out that it's this bug. Can this at least be documented somewhere, if it is not fixed by disabling EEE by default?
Question about:
The main consideration behind enabling EEE by default is the enormous number of Pis that are not and will never be connected by Ethernet. If the feature was opt-in, it would almost never be enabled.
Does the ethernet port consume any power at all when it's not used/not connected, such that EEE would even be relevant?
Given the age of this thread and the number of contributors, it isn't as widespread a problem as you seem to think.
Unfortunately this is not the only place where people complain. The web is full of reports of this bug.
I've got a Pi4 here, that has been connected to our internal 1GB/s network, for the last 14 days. 674 RX errors, 0 TX errors.
So, I have effectively done your experiment, and it clearly works (674 RX errors from 5368536 packets is insignificant)
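For reference, the error rate implied by those two counters can be worked out with a one-line awk call. The function name error_rate_percent is invented for this sketch; the figures are the ones quoted above.

```shell
#!/bin/sh
# Hypothetical helper: prints errors/packets as a percentage with four
# decimal places. Exits non-zero if the packet count is zero.
error_rate_percent() {
    awk -v e="$1" -v p="$2" 'BEGIN { if (p == 0) exit 1; printf "%.4f\n", (e / p) * 100 }'
}

error_rate_percent 674 5368536   # prints 0.0126
```

That is roughly 0.013% of received packets. Note that this only counts errors the interface itself recorded, not packets that never reached it.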
Is 1gb also enabled? Can you post a terminal output of it please (ethtool --show-eee eth0)?
My CM4 running Raspberry Pi OS connected to a Netgear GS205
pi@raspberrypi:~ $ ethtool --show-eee eth0
EEE settings for eth0:
EEE status: enabled - active
Tx LPI: disabled
Supported EEE link modes: 100baseT/Full
1000baseT/Full
Advertised EEE link modes: 100baseT/Full
1000baseT/Full
Link partner advertised EEE link modes: 100baseT/Full
1000baseT/Full
Pi4 running Ubuntu on the same switch:
pi@pi:~$ ethtool --show-eee eth0
EEE settings for eth0:
EEE status: enabled - active
Tx LPI: disabled
Supported EEE link modes: 100baseT/Full
1000baseT/Full
Advertised EEE link modes: 100baseT/Full
1000baseT/Full
Link partner advertised EEE link modes: 100baseT/Full
1000baseT/Full
Both are connected at 1000Mb/s full duplex.
I've got a Pi4 here, that has been connected to our internal 1GB/s network, for the last 14 days. 674 RX errors, 0 TX errors. So, I have effectively done your experiment, and it clearly works (674 RX errors from 5368536 packets is insignificant)
Is 1gb also enabled? Can you post a terminal output of it please (ethtool --show-eee eth0)?
Same result as 6by9's post.
If true, that's an interesting outcome. I would assume that a hardware design flaw would affect all devices, and this does not look like some physical breakage. Any idea where this issue might be coming from? Firmware? Does the interaction between the two link partners play a role in proper EEE operation?
I would need to try my Pi 4 on a different router/modem and see whether the issue always appears, to rule out the other side playing a role here. But because every other device on my router/modem works fine with EEE at 1 Gb/s, I would say it's on the Pi 4's side.
There is clearly an incompatibility between some implementations of EEE - if you have an incompatible switch (note, I didn't say non-compliant or broken - compatibility requires both sides to get along) then it will probably fail 100% of the time, but conversely if you have a compatible switch it will be hard to understand what the fuss is about - it just works.
We have never claimed that the EEE implementation found on Pis works with all switches, but as I said above I don't think the non-working combinations are as common as you think.
My CM4 running Raspberry Pi OS connected to a Netgear GS205
That is very interesting. I happened to test a Pi 4 (Raspberry Pi 4 Model B Rev 1.2, 4GB) against a Netgear GS108Ev3 several days ago and saw the packet loss issue, with ping loss of about 10-15%. I did not expect much difference between the GS205 and the GS108Ev3 beyond the plastic versus metal enclosure.
So, I have effectively done your experiment, and it clearly works (674 RX errors from 5368536 packets is insignificant)
Since the issue is caused by packet loss rather than receive errors, I think the RX error count does not reflect the number of packets actually lost. Some packets may be lost before they ever reach the driver, so they never increment the RX error counter. Can you run a ping test from another machine on the same network for about 5-10 minutes to see whether you get any packet loss? For this test, the Pi needs to be connected to an EEE-enabled switch (ethtool shows EEE status: enabled - active).
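A minimal sketch of such a ping test, assuming the summary line format printed by the common Linux iputils ping ("N packets transmitted, M received, X% packet loss, ..."). The function name ping_loss_percent and the address used below are placeholders, not anything from this thread.

```shell
#!/bin/sh
# Hypothetical helper: reads ping output on stdin and prints only the
# packet-loss percentage taken from the summary line.
ping_loss_percent() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}
```

Run from another machine, something like `ping -c 600 -i 1 192.168.1.50 | ping_loss_percent` (with the Pi's real address substituted) would cover about ten minutes and print a single number to compare with EEE on and off.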
People don't bother to open or reply to issues for things that work for them.
Note that the SSH connection is usually ok except for some random lags when the packet is lost during typing and echo. The same for other TCP applications. So I think maybe many people will not notice the problem.
That is exactly what I have experienced also with my previous modem that had a 100mb interface. I guess it's less noticeable then. Even with the 1gbs connection, I am able to use ssh, also the application runs okayish. Issue was that I had random connection losses, timeouts etc and with switching to the new modem (1gbs) it got worse, my containers constantly crashed and had to restart whole day.
Since I disabled eee, ssh runs very smooth and so far no connectivity issues.
I have the Pi4B 8gb, running multiple high traffic applications on it.
FWIW, at home with a Pi4 running TVHeadend running Buster (5.10.63 kernel), connected to a Netgear GS750E
pi@raspberrypi:~ $ uptime
18:23:18 up 5 days, 23:02, 1 user, load average: 0.10, 0.12, 0.09
pi@raspberrypi:~ $ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.2.250 netmask 255.255.255.0 broadcast 192.168.2.255
inet6 fe80::cfb5:7c6c:486d:ea29 prefixlen 64 scopeid 0x20<link>
ether dc:a6:32:xx:xx:xx txqueuelen 1000 (Ethernet)
RX packets 1982298 bytes 529189652 (504.6 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 934646 bytes 1217709716 (1.1 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 67166 bytes 3379900 (3.2 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 67166 bytes 3379900 (3.2 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
wlan0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether dc:a6:32:00:a9:4a txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
pi@raspberrypi:~ $ ethtool --show-eee eth0
EEE Settings for eth0:
EEE status: enabled - active
Tx LPI: disabled
Supported EEE link modes: 100baseT/Full
1000baseT/Full
Advertised EEE link modes: 100baseT/Full
1000baseT/Full
Link partner advertised EEE link modes: 100baseT/Full
1000baseT/Full
Copying the script from https://forums.raspberrypi.com/viewtopic.php?p=1830016#p1830016 to ping my router (on another port of the GS750E, but one that is running multiple VLANs), I get no dropped packets.
Since the issue is caused by packet loss rather than receive errors, I think the RX error count does not reflect the number of packets actually lost. Some packets may be lost before they ever reach the driver, so they never increment the RX error counter. Can you run a ping test from another machine on the same network for about 5-10 minutes to see whether you get any packet loss? For this test, the Pi needs to be connected to an EEE-enabled switch (ethtool shows EEE status: enabled - active).
So, did a 10 or so minute run pinging from Pi 4 to a Pi 3. 722 packets sent, no packet loss. EEE enabled as before.
This issue is not specific to the Pi 4. It can, and does, cause problems on just about any device. I work for an MSP managing several thousand devices and we've seen EEE issues with NICs from every major vendor. Realtek seem to be the worst for it, but some Intel/Broadcom NICs are just as bad, and it varies between individual units of the same model.
EEE is known to be a common cause of weird intermittent packet loss across most of the industry; The switch that's in use makes a much bigger difference than the NIC/device. Anecdotally, we have the most problems with $20 unmanaged D-Link/Netgear switches (people like to buy their own rather than just ask for one, for some reason) and the least trouble with big shiny expensive enterprise managed ones, but it's really hit-and-miss.
As an example, the Aruba 2530 8+2-port PoE managed switch has an EEE implementation that's absolutely garbage - all of the ~6 Pi4/CM4s I have exhibit this problem when connected to it with EEE enabled, along with quite a few other things - but the higher-port-count models in the same lineup are fine 🤷
Point being, while the Pi 4 does seem to have a somewhat higher incidence of EEE problems than the norm, it's not particularly out of the ordinary, but is more of a reflection on the EEE standard itself having compatibility problems in general than a problem with the Pi 4 specifically.
I don't think there's any real argument for disabling EEE by default - it works on the vast majority of switches, for the vast majority of people, the vast majority of the time, and it's not hard at all to find the solution once you search the web for 'pi ethernet packet loss', 'pi network dropout', etc. if it does happen to affect you.
It would maybe be nice to have a bundled-in systemd unit (disabled by default) to execute the ethtool command at startup, but it's trivial to make one yourself once you know what the issue is; as far as I can tell no major Linux distro has such a thing, implying that it's not common enough of an issue to be worth creating/packaging, which kind of says it all.
Thank you guys for all the effort with testing and explaining in detail. Especially the last answer was quite helpful to understand the whole picture with eee.
Thanks a lot to @6by9 and @JamesH65 for the prompt test results. That is solid proof that EEE on the Pi 4 works well with other switches. Many thanks also to @neggles for the insight into how common EEE issues are across the industry. It's true that disabling EEE by default is not necessary, given that EEE works for most people.
It would maybe be nice to have a bundled-in systemd unit (disabled by default) to execute the ethtool command at startup, but it's trivial to make one yourself once you know what the issue is;
Yes, I also agree that a bundled script to execute the ethtool command would be handy, especially since I found that doing this correctly is not that trivial. The ethtool --set-eee eth0 eee off command fails if the interface has not been brought up, so running it from rc.local or a systemd unit may have no effect. On the other hand, successfully running the ethtool command forces renegotiation of the ethernet link modes, which interrupts all network connections for several seconds, so one might want to have this done as early as possible during boot.
I can help with that; save this as /etc/systemd/system/[email protected]:
[Unit]
Description=Disable EEE on %i on startup
Wants=network.target network-online.target
After=network-online.target
[Service]
Type=simple
RemainAfterExit=true
# Uncomment the below to do a slightly hacky check for whether the link is up.
#ExecStartPre=/bin/bash -c '[ $(cat /sys/class/net/%i/carrier) == "1" ]'
ExecStart=/sbin/ethtool --set-eee %i eee off
# If we get an unclean exit code, retry
Restart=on-failure
# Wait 30s before retrying
RestartSec=30s
[Install]
WantedBy=multi-user.target
then run sudo systemctl daemon-reload && sudo systemctl enable --now [email protected] - optionally you can omit the @ from the unit file name and hardcode the interface name by replacing %i with eth0/en<blah> depending on what your distro calls it.
This will attempt to set EEE off once the network has come online, and will try again every 30s until it succeeds. Optionally, if you uncomment ExecStartPre it will check whether the interface is up before attempting to set the EEE state - this is what I use on my own Pis (and other misbehaving devices), and so far I've found it to be reliable.
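The carrier check from the commented-out ExecStartPre line could also live in a small helper script, which makes it easier to try out by hand. The sketch below is only an illustration: the function names and the SYSFS_NET variable are hypothetical, and SYSFS_NET exists purely so the functions can be pointed at a test directory instead of the real /sys/class/net.

```shell
#!/bin/sh
# Hypothetical override; on a real system this stays at /sys/class/net.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Succeeds only if the interface reports carrier "1". A failed read of the
# carrier file (missing or unreadable) is treated the same as "no link".
link_is_up() {
    iface="$1"
    carrier="$(cat "$SYSFS_NET/$iface/carrier" 2>/dev/null)" || return 1
    [ "$carrier" = "1" ]
}

# Polls once a second, up to the given number of tries (default 30),
# and succeeds as soon as the link is up.
wait_for_link() {
    iface="$1"
    tries="${2:-30}"
    while [ "$tries" -gt 0 ]; do
        link_is_up "$iface" && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

For example, `wait_for_link eth0 60 && ethtool --set-eee eth0 eee off` would wait up to a minute for the link before issuing the same ethtool command the unit runs.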
The other option would be to write up a patch for the bcmgenet driver adding an eee parameter, so that passing bcmgenet.eee=0 in the kernel command line would disable EEE at initialization; the igb driver used to have such a parameter, but it seems to have since been removed, so upstream might not be very willing to take such a patch 🤷
I've been looking at the bcmgenet driver and its EEE support. It seems that currently EEE is not explicitly enabled, and yet ethtool thinks/knows it is enabled. I also found that I can't re-enable EEE once it has been disabled - there is a check that the PHY supports EEE advertising(?), and the check fails.
If I were to create a Pull Request with a patch to the bcmgenet driver, would anyone here who is experiencing EEE problems be happy to apply the patch and build their own kernel to test?
There's a PR - #5277 - that adds a module parameter and a dtparam (which just sets the module parameter).
Either add genet.eee=N to /boot/cmdline.txt or add dtparam=eee=off to config.txt.
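To confirm which of the two settings is actually in place after editing, a quick check along these lines may help. The function names are invented for this sketch, and the config check deliberately matches only a standalone dtparam=eee=off line, so a combined line such as dtparam=a=b,eee=off would not be detected.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given config.txt has an uncommented,
# standalone "dtparam=eee=off" line.
config_disables_eee() {
    grep -Eq '^[[:space:]]*dtparam=eee=off[[:space:]]*$' "$1"
}

# Hypothetical helper: succeeds if the given cmdline.txt passes any
# genet.eee=<value> parameter.
cmdline_sets_genet_eee() {
    grep -Eq '(^|[[:space:]])genet\.eee=[^[:space:]]+' "$1"
}
```

Typical use would be `config_disables_eee /boot/config.txt` and `cmdline_sets_genet_eee /boot/cmdline.txt`; the /boot/config.txt location is an assumption here, since only /boot/cmdline.txt is spelled out above.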