linux icon indicating copy to clipboard operation
linux copied to clipboard

Wrong Time stamping parameters for eth0 on Raspberry Pi 4 for Kernel 5.15.50

Open wbb12 opened this issue 3 years ago • 26 comments

Describe the bug

I want to use the IEEE 1588 PTP in software time stamping to sync the time between multiple 4B PIs, though the software time stamping has only a slight advantage over NTP. In the newest 64-bit official OS image, the kernel of which is 5.15.32, the eth0 supports the software time stamping by "ethtool -T eth0". image However, when I build the new kernel following the step in https://www.raspberrypi.com/documentation/computers/linux_kernel.html, the kernel version is 5.15.50. The ethtool shows that the interface support only hardware time stamping. image

The PHY used on RPi 4 is BCM54213PE which doesn't support IEEE 1588. If forcing the ptp4l using hardware time stamping, it will return errors of "bad message". Because the software supports are gone missing, I cannot use software time stamping either.

How can I modify the .config file before compiling the kernel to fix this bug?

Steps to reproduce the behaviour

5.15.50-v8 kernel and cmd "ethtool -T eth0" will show the error on RPI 4

Device (s)

Raspberry Pi 4 Mod. B

System

Raspberry Pi reference 2022-04-04 Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, 27a8050c3c06e567c794620394a8c2d74262a516, stage2

Logs

No response

Additional context

No response

wbb12 avatar Jul 25 '22 08:07 wbb12

On a locally-compiled 5.15.45 the issue is not present. On a fresh rpi-update kernel (5.15.56) the issue is present. Bisecting...

P33M avatar Aug 15 '22 10:08 P33M

Good: 93baedb93739 configs: Rengenerate Bad: 4ee9d0f86680 Merge remote-tracking branch 'stable/linux-5.15.y' into rpi-5.15.y The bad commit is the 5.15.50 upstream merge. Reverting b761f6dd3619fd9cf6f193a8363aab8600676e3f makes 5.15.50 a good commit.

commit b761f6dd3619fd9cf6f193a8363aab8600676e3f
Author: Dom Cobley <[email protected]>
Date:   Mon Jun 27 12:43:12 2022 +0100

    config: Enable CONFIG_NETWORK_PHY_TIMESTAMPIN on 2711

    See: https://github.com/raspberrypi/linux/issues/4151

P33M avatar Aug 15 '22 12:08 P33M

b8e68d153d429de337d173e6bfc295f1d8ba557d incorrectly adds the ID for BCM54213PE to the bcm_ptp_probe function.

As you're building locally, can you try reverting it on top of latest rpi-5.15.y and test?

P33M avatar Aug 15 '22 14:08 P33M

Thank you! I comment out the BCM54213PE lines in the bcm_ptp_probe function and rebuild the kernel. The timestamp returns to normal. image

wbb12 avatar Aug 19 '22 07:08 wbb12

Anyone know if this has been fixed in the latest kernel version (5.15.68)? I'm going to build it but that will take a few hours. I'll post my results here too once I have some...

AndrewLuebke avatar Sep 23 '22 18:09 AndrewLuebke

Nothing has changed.

I was told to cherry-pick the commits needed to support https://github.com/raspberrypi/linux/issues/4151 But it's not something I have knowledge of, or ability to test. Reverting any of these will (I believe) stop CM4 IEEE1588/PTP functionality.

If @P33M knows what is needed to support CM4 and not break Pi4 then we may be able to proceed.

popcornmix avatar Sep 23 '22 18:09 popcornmix

I think https://github.com/raspberrypi/linux/commit/b8e68d153d429de337d173e6bfc295f1d8ba557d needs to be reverted. BCM54213PE does not support PTP and including its device ID for PTP is a bug.

P33M avatar Sep 23 '22 18:09 P33M

I agree.

pelwell avatar Sep 23 '22 20:09 pelwell

The commit was requested here: https://github.com/raspberrypi/linux/issues/4151#issuecomment-1165736944 Is the second commit okay?

popcornmix avatar Sep 24 '22 10:09 popcornmix

Yes, it looks like it just aligns downstream with the new upstream phy features.

P33M avatar Sep 24 '22 13:09 P33M

Commit has been reverted and rpi-update kernel includes this revert.

popcornmix avatar Sep 25 '22 17:09 popcornmix

Ok searched a bit on menu config but I can not figure it out.

How to enable the support to CM4 for HW timestamps on new kernel? On kernel 5.15.65 everything was fine. The revert is only for RPI 4B or for the CM4 also?

Regards.

Edit

I just compile it again still not showing anything on ethtool.

εικόνα

εικόνα

I am not seeing any config to enable it^^^^^

Maybe I am doing something wrong.

jnk19 avatar Sep 30 '22 16:09 jnk19

Any news about CM4 for enabling HW timestamp? Tried kernel 5.15.72 with the same results.

jnk19 avatar Oct 07 '22 20:10 jnk19

Don't cross the streams - you want this thread: https://github.com/raspberrypi/linux/issues/4151

pelwell avatar Oct 07 '22 20:10 pelwell

@wbb12 I'm a little confused. Why can't you just use software timestamping? The fact that the in-kernel software timestamping is not supported with the so_timstamping api does not prevent you from using software timestamping does it? I accept that none of this is ideal, but the problem is the two Broadcom controllers present the same IDs... I am currently pleading with Broadcom to come up with a fix for this.

lasselj avatar Nov 02 '22 17:11 lasselj

@lasselj I suppose that the "software transmit" is not in the time-stamping capabilities of eth0, so the ptp4l cannot be forced into software mode. For other NICs that really support hardware time-stamping, software time-stamping is a default capability.

wbb12 avatar Nov 02 '22 17:11 wbb12

Yeah, they are both part of the so_timestamping subsystem. I don’t understand LinuxPTP well, but in Timebeat if you had neither option we capture a timestamp just before the send call and just after the recv call in user land (actually we capture it when epoll says it has business). The difference between that strategy and the software-transmit (SOF_TIMESTAMPING_TX_SOFTWARE) and software-receive (SOF_TIMESTAMPING_RX_SOFTWARE) strategy is pretty negligible. Maybe LinuxPTP does not support this approach. #DownloadTimebeat

All the best,

Lasse

On 2 Nov 2022, at 17:17, Wang Binbin @.***> wrote:

@lasselj https://github.com/lasselj I suppose that the "software transmit" is not in the time-stamping capabilities of eth0, so the ptp4l cannot be forced into software mode. For other NICs that really support hardware time-stamping, software time-stamping is a default capability.

— Reply to this email directly, view it on GitHub https://github.com/raspberrypi/linux/issues/5104#issuecomment-1300955183, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADCEZECWEPGW5QQMZKMWHQTWGKO4PANCNFSM54RKYNRA. You are receiving this because you were mentioned.

lasselj avatar Nov 02 '22 17:11 lasselj

I think b8e68d1 needs to be reverted. BCM54213PE does not support PTP and including its device ID for PTP is a bug.

I don't think it's as simple as this. The 4B and the CM4 unfortunately have the same PHY id, even though they have different hardware: /sys/class/net/eth0/phydev/phy_id contains 0x600d84a2 on both of them. PHY_ID_BCM54213PE is 0x600d84a2. Thus reverting b8e68d1 will break things on the CM4.

It seems to me to not a good tradeoff to fix software PTP timestamping on the 4B at the cost of breaking hardware PTP timestamping on the CM4. The main value of PTP comes with hardware timestamping: if you only have software timestamping you lose very little if you use NTP rather than PTP. By contrast, the PTP hardware timestamping support in the CM4 has enormous potential for the PTP community. It is currently by far the cheapest way to implement a high-quality PTP grandmaster, since the PTP extts support, which is essential for this, is available on very few NICs.

A good solution to this issue needs to find a way to make PTP software timestamping to work on the 4B with linuxptp, without breaking things on the CM4. Ptp4l has a -S configuration option that is supposed to force it to use software timestamping. But this requires that the network driver support SOF_TIMESTAMPING_TX_SOFTWARE|SOF_TIMESTAMPING_RX_SOFTWARE|SOF_TIMESTAMPING_SOFTWARE, and bcm-phy-ptp does not support this.

I have another machine with an i211 and the igb driver for that supports both hardware and software timestamping. igb_get_ts_info does

   case e1000_i211:
		info->so_timestamping =
			SOF_TIMESTAMPING_TX_SOFTWARE |
			SOF_TIMESTAMPING_RX_SOFTWARE |
			SOF_TIMESTAMPING_SOFTWARE |
			SOF_TIMESTAMPING_TX_HARDWARE |
			SOF_TIMESTAMPING_RX_HARDWARE |
			SOF_TIMESTAMPING_RAW_HARDWARE;

So I wondered whether the fix was as simple as doing the same in bcm_ptp_ts_info.

Edited. This seems to make things work on the 4B side. I am doing sudo ptp4l -s -m -q -l 7 -i eth0 -S on a 4B. (It didn't work the first time, because I had another instance of ptp4l running without -S.)

The patch is just:

diff --git a/drivers/net/phy/bcm-phy-ptp.c b/drivers/net/phy/bcm-phy-ptp.c
index ef00d6163..54e92a6a9 100644
--- a/drivers/net/phy/bcm-phy-ptp.c
+++ b/drivers/net/phy/bcm-phy-ptp.c
@@ -850,6 +850,9 @@ static int bcm_ptp_ts_info(struct mii_timestamper *mii_ts,

        ts_info->phc_index = ptp_clock_index(priv->ptp_clock);
        ts_info->so_timestamping =
+               SOF_TIMESTAMPING_TX_SOFTWARE |
+               SOF_TIMESTAMPING_RX_SOFTWARE |
+               SOF_TIMESTAMPING_SOFTWARE |
                SOF_TIMESTAMPING_TX_HARDWARE |
                SOF_TIMESTAMPING_RX_HARDWARE |
                SOF_TIMESTAMPING_RAW_HARDWARE;
@@ -915,6 +918,9 @@ struct bcm_ptp_private *bcm_ptp_probe(struct phy_device *phydev)

        switch (BRCM_PHY_MODEL(phydev)) {
        case PHY_ID_BCM54210E:
+#ifdef PHY_ID_BCM54213PE
+       case PHY_ID_BCM54213PE:
+#endif
                break;
        default:
                return NULL;

In this message, it says that the only thing needed for a driver to support software timestamping is a call to skb_tx_timestamp(skb), and bcmgenet does that.

jclark avatar Nov 03 '22 07:11 jclark

@popcornmix I think not a bad compromise.... what approach do you prefer? I can submit patch to upstream or @jclark can do it. ptp4l will then work on the 4revB when forced to swts. Let me know.

lasselj avatar Nov 03 '22 14:11 lasselj

@pelwell @P33M are you happy with this approach?

popcornmix avatar Nov 03 '22 15:11 popcornmix

I think we should keep it as a reserve option. I'm wondering if there is a way to resolve the ambiguity some other way.

Device Tree is an option, but for the limited case of CM4 vs 4B I think there's an easier way:

diff --git a/drivers/net/phy/bcm-phy-ptp.c b/drivers/net/phy/bcm-phy-ptp.c
index ef00d6163061..30ee23e73a4b 100644
--- a/drivers/net/phy/bcm-phy-ptp.c
+++ b/drivers/net/phy/bcm-phy-ptp.c
@@ -916,6 +916,18 @@ struct bcm_ptp_private *bcm_ptp_probe(struct phy_device *phydev)
        switch (BRCM_PHY_MODEL(phydev)) {
        case PHY_ID_BCM54210E:
                break;
+#ifdef PHY_ID_BCM54213PE
+       case PHY_ID_BCM54213PE:
+               switch (phydev->mdio.addr) {
+               case 0: // CM4 - this is a BCM54210PE which supports PTP
+                       break;
+               case 1: //  4B - this is a BCM54213PE which doesn't
+                       return NULL;
+               default: // Unknown - assume it's BCM54210PE
+                       break;
+               }
+               break;
+#endif
        default:
                return NULL;
        }

This code is untested, other than a quick compile-check, but I think it should work. Note that it's written verbosely for clarity, and that I'm undecided about the default case for an unknown address.

pelwell avatar Nov 03 '22 15:11 pelwell

I tested the patch from @pelwell on a 4B, and it works as expected. I haven't tried it on a CM4 yet. Assuming it also works on the CM4, I agree it's a better solution: a 4B PTP user would have a hard time figuring out that they needed the -S option.

For the default unknown address case, maybe the approach I proposed earlier would make sense i.e. have the driver say it supports software timestamping and then treat the unknown address case like a CM4. I haven't tested software timestamping on the CM4, but I don't see any reason why it shouldn't work.

jclark avatar Nov 04 '22 02:11 jclark

Is there any more feedback on this, before I merge my patch?

pelwell avatar Nov 05 '22 13:11 pelwell

Hi Phil,

Sorry for the delay. I confirm this works on the CM4 and enables HWTS.

Great solution. Thank you for highlighting it! Merge away… :-)

All the best,

Lasse

On 5 Nov 2022, at 13:42, Phil Elwell @.***> wrote:

Is there any more feedback on this, before I merge my patch?

— Reply to this email directly, view it on GitHub https://github.com/raspberrypi/linux/issues/5104#issuecomment-1304549100, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADCEZEBAFTMPNJHSQJVHRYDWGZP3PANCNFSM54RKYNRA. You are receiving this because you were mentioned.

lasselj avatar Nov 05 '22 15:11 lasselj

Done - see https://github.com/raspberrypi/linux/commit/13d915df82494bee901b8d4e9045242e39861761.

pelwell avatar Nov 05 '22 17:11 pelwell

@pelwell I wonder if this fix is working on the Raspberry Pi 5. This user reports suggests to me that it isn't:

https://lists.nwtime.org/sympa/arc/linuxptp-users/2024-02/msg00012.html

This error I think means that the kernel is exposing a PTP hardware clock with external timestamping support on the Pi 5, which I don't believe it has. Unfortunately, I don't yet have a Pi 5 to investigate further.

jclark avatar Feb 14 '24 23:02 jclark