BPI-Router-Linux icon indicating copy to clipboard operation
BPI-Router-Linux copied to clipboard

SFP interface does not work on you 6.9-rc branch

Open jcdutton opened this issue 3 months ago • 36 comments

Board: BPI-R3 sfp sfp-1: module OEM SFP-2.5G-T rev 1.0 sn REDACTED dc REDACTED ethtool looks like something it connected, but the link is always down: 3: eth1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000 link/ether 7a:3b:38:39:40:41 brd ff:ff:ff:ff:ff:ff root@bpi-r3:/etc/hostapd# ethtool eth1 Settings for eth1: Supported ports: [ MII ] Supported link modes: 2500baseX/Full 2500baseT/Full Supported pause frame use: Symmetric Receive-only Supports auto-negotiation: Yes Supported FEC modes: Not reported Advertised link modes: 2500baseX/Full 2500baseT/Full Advertised pause frame use: Symmetric Receive-only Advertised auto-negotiation: No Advertised FEC modes: Not reported Speed: 2500Mb/s Duplex: Full Auto-negotiation: off Port: MII PHYAD: 0 Transceiver: internal Current message level: 0x000000ff (255) drv probe link timer ifdown ifup rx_err tx_err Link detected: no

By comparison. The SFP works on openwrt. Also, commands take a long time to execute, so something is blocking.

jcdutton avatar Mar 30 '24 13:03 jcdutton

Make sure the pcs driver (lynxi on r3/r4,xfi on r4 for 10g) is selected.

frank-w avatar Mar 30 '24 14:03 frank-w

The lynxi is selected. But no link is established.

jcdutton avatar Mar 30 '24 18:03 jcdutton

The attached patch fixes the SFP for me. sfp-fix1.diff.txt

jcdutton avatar Mar 30 '24 21:03 jcdutton

So you use the oem 2g5 sfp i also have?

Your patch basicly reverts erics patch right?

https://github.com/frank-w/BPI-Router-Linux/commit/dd3dd754cdc7cb1e61f6588eaa6b8ee15896b42b

Have you added realtek phy driver from him?

frank-w avatar Mar 30 '24 21:03 frank-w

I don't think my transceiver uses a realtek phy. What command do I use to identify the phy?

jcdutton avatar Mar 30 '24 23:03 jcdutton

The SFP transceiver presents I2C interface on 0x50, 0x51, 0x56. So, it looks to me that it supports multiple different protocols over I2C. For example, 0x51 has this: CNS8TUTAAC30-1410-04V04 and GLC-T which looks strangely similar to: Cisco GLC-T 1000BASE-T SFP Module - 30-1410-04 CNS8TUTAAC

I don't think it supports MDIO. I think this confuses the sfp.c probing, that results in it choosing the swphy. A software emulation for a phy.

jcdutton avatar Mar 31 '24 02:03 jcdutton

i2cdump -y 0 0x56
No size specified (using byte-data access)
     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f    0123456789abcdef
00: 10 00 79 49 4f 51 ea 19 11 e1 00 00 00 64 20 01    ?.yIOQ????...d ?
10: 00 00 02 00 00 00 00 00 00 00 00 00 00 00 20 00    ..?........... .
20: 00 62 00 00 00 00 00 00 08 2c 00 00 00 00 00 00    .b......?,......
30: 00 00 00 00 00 00 00 00 00 00 00 00 a0 00 00 00    ............?...
40: 09 40 41 c9 4f 51 ea 19 00 20 00 00 00 00 00 00    ?@A?OQ??. ......
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80 00    ..............?.
60: 00 00 20 02 00 04 00 00 70 28 00 00 00 00 00 00    .. ?.?..p(......
70: 00 00 00 00 00 00 00 00 00 00 00 00 30 fc 80 01    ............0???
80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff    ................
90: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff    ................
a0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff    ................
b0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff    ................
c0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff    ................
d0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff    ................
e0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff    ................
f0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff    ................

Whereas someone else saw this: At offset 4 it has: 0x001cc849 , which means that we have this PHY at 0x56:

		PHY_ID_MATCH_EXACT(0x001cc849),
		.name           = "RTL8221B-VB-CG 2.5Gbps PHY",

I see offset 4 has: 0x4f51ea19 instead of 0x001cc849 so I need to find with PHY 0x4f51ea19 is.

jcdutton avatar Mar 31 '24 13:03 jcdutton

You should post this to bpi-forum where eric can take a look at it. I'm no phy specialist :p but 0x56 looks like rollball protocol and good you can access this address...sometimes it is 0xff only

You have quoted eric above from here: https://forum.banana-pi.org/t/sfp-oem-sfp-2-5g-t-kernel-phy/15872/13?u=frank-w

frank-w avatar Mar 31 '24 14:03 frank-w

I tried with the rollball and I added some debug. It comes back saying rollball not found.

jcdutton avatar Mar 31 '24 15:03 jcdutton

@frank-w Apparently the 0x4f51ea19 is a motorcomm YT8821 PHY.

jcdutton avatar Mar 31 '24 17:03 jcdutton

Have you deleted the post with the mii patch?

frank-w avatar Apr 04 '24 19:04 frank-w

Yes, I did delete it. It did not actually work. I am working on a different patch. Just trying to work out where exactly the *2 needs to happen in the code between C22 reg and i2c messages.

jcdutton avatar Apr 04 '24 20:04 jcdutton

Well, on the positive side, I have the yt8821 phy working using C22. Notice the added "<< 1" to do the reg * 2. On the negative side, I have not worked out how to do the force auto neg off, so that the interface passes traffic. The remote side is seeing link up.

static int i2c_mii_read_default_c22(struct mii_bus *bus, int phy_id, int reg)
{
        return i2c_mii_read_default_c45(bus, phy_id, -1, reg << 1);
}

static int i2c_mii_write_default_c22(struct mii_bus *bus, int phy_id, int reg,
                                     u16 val)
{
        return i2c_mii_write_default_c45(bus, phy_id, -1, reg << 1, val);
}

Here is the output from the C22:

ethtool sfp1                                                                                                                                                            
Settings for sfp1:                                                                                                                                                                     
        Supported ports: [ TP MII ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10baseT/Half 10baseT/Full     
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Speed: 2500Mb/s
        Duplex: Full
        Auto-negotiation: on
        Port: Twisted Pair
        PHYAD: 22
        Transceiver: external                                                              
        MDI-X: Unknown
        Current message level: 0x000000ff (255)
                               drv probe link timer ifdown ifup rx_err tx_err
        Link detected: no                       

jcdutton avatar Apr 04 '24 22:04 jcdutton

In C22 mode, the motorcomm yt8821 phy is setting link up in phydev->link = 1. I have not yet found out how to propagate that to the ethtool "Link detected" status. Does anyone have an idea how?

jcdutton avatar Apr 05 '24 01:04 jcdutton

I2c is byte oriented, mii is word oriented, so this is where the difference comes from.

ericwoud avatar Apr 05 '24 10:04 ericwoud

I have it almost fixed. The PHY link is up, the MAC link is down. For ethtool to show a "Link detected" both have to be up. Does anyone know when the MAC link should be brought up and how to do that? Which function to call where? I guess the options are:

  1. When a user does "ifconfig sfp1 up" or similar the MAC link could come up. The the MAC link being up would be ready for the PHY link coming up.
  2. The MAC link stays down, saving power, until the PHY link is up, and then activates the MAC link to match.
In: drivers/net/phy/phylink.c: phylink_resolve()
if (pl->phydev)
         link_state.link &= pl->phy_state.link;

pl->phy_state.link == 1      <--- The PHY state
link_state.link == 0             <--- The MAC state.

jcdutton avatar Apr 05 '24 10:04 jcdutton

It probably doesn't come up, because 2500base-x inband is not supported by mac.

ericwoud avatar Apr 05 '24 10:04 ericwoud

I just need the MAC to come up at 2500, irrespective of what the PHY is doing. The PHY does the rate adjustment.

jcdutton avatar Apr 05 '24 10:04 jcdutton

Is there an IRC or similar channel we can chat on. It might be quicker to resolve this?

jcdutton avatar Apr 05 '24 10:04 jcdutton

I2c is byte oriented, mii is word oriented, so this is where the difference comes from.

So is this a bug? Will putting the "reg << 1" in the code, fix C22 mode for multiple Transceivers? Thus not needing rollball any more? Can anyone test with other transceivers. I only have one make of transceiver here. I put this in the fixup to force it to try C22: sfp->mdio_protocol = MDIO_I2C_MARVELL_C22; sfp->id.base.extended_cc = SFF8024_ECC_2_5GBASE_T;

jcdutton avatar Apr 05 '24 10:04 jcdutton

i'm not on irc

i had added erics patches where the last disables inband for 2.5G, maybe you only need a quirk setting the 2.5g for your SFP like we have done for other SFPs

frank-w avatar Apr 05 '24 11:04 frank-w

Will putting the "reg << 1" in the code, fix C22 mode for multiple Transceivers? Thus not needing rollball any more?

Cannot add << 1, it will break every C22 phy out there. Other modules will still need RollBall protocol. This module, I'll have to take a closer look, what's really going on.

ericwoud avatar Apr 05 '24 14:04 ericwoud

@ericwoud Just to clarify. Say we wish to read the ID from device 0x56 using C22. (as is done in phy_device.c: get_phy_c22_id() #define MII_PHYSID1 0x02 #define MII_PHYSID2 0x03 The read of 0x02 needs to be converted into reading offset 0x04 (2 octets) from i2c device 0x56. The read of 0x03 needs to be converted into reading offset 0x06 (2 octets) from i2c device 0x56.

So, a conversion from 0x02 to 0x04 has to happen somewhere. Where should that happen?

jcdutton avatar Apr 05 '24 17:04 jcdutton

I have not managed to coax the MAC to turn on its link while the yt8821 PHY has a link. So, even with the transceiver reporting link up and link speed, the MAC is not turning on so no traffic passes. Its a bit like wack a mole. The PHY sends the netif_carrier_on() to make the link up, and the MAC phy_link code turns it off again because the MAC is not UP. For traffic to pass both MAC and PHY links need to be up. Is there a command line to use to force the MAC to be UP ?

jcdutton avatar Apr 05 '24 19:04 jcdutton

Have you tried the quirk approach to set the speed to 2500baseT for the sfp?

frank-w avatar Apr 05 '24 19:04 frank-w

I have tried various quirks without luck. Please suggest a code snippet of the quirk you are suggesting

jcdutton avatar Apr 05 '24 19:04 jcdutton

similar to this, but with your vendor/product

https://github.com/frank-w/BPI-Router-Linux/commit/e27aca3760c08b7b05aea71068bd609aa93e7b35

can you pls post part of your dmesg where sfp is recognized for vendor-string and/or output of ethtool -m ethX

frank-w avatar Apr 05 '24 19:04 frank-w

Hi. With that e27aca3 the sfp passes ethernet traffic, but no SFP PHY is detected because no PHY is even looked for.

If I change the quirks for force a C22 PHY, it detects the yt8821 phy and uses it, but the MAC is disabled so no traffic flows.

I guess I am looking for a quirk that forces "pl->cur_link_an_mode=0x1" instead of "pl->cur_link_an_mode=0x2" I.e. "Fixed" instead of "in-band".

jcdutton avatar Apr 05 '24 20:04 jcdutton

guess I am looking for a quirk that forces "pl->cur_link_an_mode=0x1" instead of "pl->cur_link_an_mode=0x2" I.e. "Fixed" instead of "in-band".

I already pointed you the patch that does that in the forum

And you want mode phy instead of fixed to replace inband

ericwoud avatar Apr 05 '24 21:04 ericwoud

I have diagnosed the problem. When the yt8821 is being used, it tries to configure the mac interface type from "0x17:2500base-x" to "0x4:sgmii" and this stops the MAC working, as it wishes to be left at "0x17:2500base-x". I think the code in phylink.c is a little confusing. It does not keep state of the PHY separate from the state of the MAC, and thus when the PHY changes, it tries to set the MAC to match it, and this obviously fails. Note, of course, maybe SFP transceivers wish the PHY to match the MAC, but that will not work with the BPI-R3 and the particular transceiver I have. In my case, the PHY is in a different interface mode, and the MAC should be fixed at 2500base-x, no matter what the PHY is doing. This is probably a lot of work to do in the phylink.c in order to keep separate the state of the PHY and the MAC, and also keep details of what the MAC should do depending on the PHY. Note: In order to prove the above, I put in some code around whenever the MAC is asked to change, that if it was a particular interface name, suppress the changing of the MAC rate away from the 0x17 value. Its not a good fix, but proved the point. Here is an example of the problem code. It is happily mixing PHY and MAC interface state:

        if (pl->cur_link_an_mode != mode ||
            pl->link_config.interface != state->interface) {
                pl->cur_link_an_mode = mode;
                pl->link_config.interface = state->interface;
                pl->phy_state.interface = state->interface;

                changed = true;
                phylink_info(pl, "switched to %s/%s link mode\n",
                             phylink_an_mode_str(mode),
                             phy_modes(state->interface));
        }

        if (changed && !test_bit(PHYLINK_DISABLE_STOPPED,
                                 &pl->phylink_disable_state))
                phylink_mac_initial_config(pl, false);

Now, there are some other things I need to fix with regards to the yt8821 code as it does not appear to be reading available and advertised link speeds correctly from the PHY, but that is for another day.

jcdutton avatar Apr 06 '24 23:04 jcdutton