Scanning PHYs that only support c45 doesn't work
I have an SFP module with a built-in Aquantia AQR113C PHY. It is accessed via the rollball protocol. The mdio application has a problem accessing such PHYs:
...
[ 29.762375] mtk_soc_eth 15100000.ethernet sfp-lan: PHY [i2c:sfp2:11] driver [Aquantia AQR113C] (irq=POLL)
[ 29.924656] mtk_soc_eth 15100000.ethernet sfp-wan: configuring for inband/10gbase-r link mode
[ 29.957040] br-wan: port 2(sfp-wan) entered blocking state
[ 29.962547] br-wan: port 2(sfp-wan) entered disabled state
[ 29.968045] mtk_soc_eth 15100000.ethernet sfp-wan: entered allmulticast mode
[ 29.975199] mtk_soc_eth 15100000.ethernet sfp-wan: entered promiscuous mode
root@OpenWrt:~# mdio
fixed-0
i2c:sfp2
mdio-bus
mt7530-0
root@OpenWrt:~# mdio i2c:sfp2
ERROR: Unable to read status (-95)
In contrast, another SFP module with a c22 PHY, accessed via i2c address 0x56, works fine:
...
[ 107.671312] sfp sfp2: module FS SFP-GB-GE-T rev F sn F2032210361 dc 210303
[ 107.777196] mtk_soc_eth 15100000.ethernet sfp-lan: switched to inband/sgmii link mode
[ 107.934790] mtk_soc_eth 15100000.ethernet sfp-lan: PHY [i2c:sfp2:16] driver [Marvell 88E1111] (irq=POLL)
[ 111.115815] mtk_soc_eth 15100000.ethernet sfp-lan: Link is Up - 1Gbps/Full - flow control rx/tx
[ 111.115843] br-lan: port 4(sfp-lan) entered blocking state
[ 111.129987] br-lan: port 4(sfp-lan) entered forwarding state
root@OpenWrt:~#
root@OpenWrt:~# mdio
fixed-0
i2c:sfp2
mdio-bus
mt7530-0
root@OpenWrt:~# mdio i2c:sfp2
DEV PHY-ID LINK
0x16 0x01410cc2 up
Is there anything that can be done? :)
I doubt that it has anything to do with the fact that it is C45-over-rollball-over-I2C. mdio-netlink just defers to the kernel drivers to sort that out.
Today the bus_status() logic, i.e. the code that runs on mdio <BUS>, assumes a C22 bus:
https://github.com/wkz/mdio-tools/blob/cd8a90801974afc64eabea664f15095b87dc289c/src/mdio/bus.c#L8-L53
In other words, there is no C45 probing support on any bus, but this is certainly something we could (should!) add.
If you know the address of the Aquantia PHY, you should be able to access it with mdio i2c:sfp2 mmd <port>:<dev> - even though the bus probing is not in place.
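The port number can be taken from the kernel log: the kernel names a PHY device after its bus id followed by the address as two hexadecimal digits, so `i2c:sfp2:11` is address 0x11, i.e. port 17. A minimal Python sketch of that conversion follows; the helper names `split_phy_name` and `mmd_command` are made up for illustration and are not part of mdio-tools.

```python
def split_phy_name(name):
    """Split a kernel PHY name such as 'i2c:sfp2:11' into (bus, address).

    The bus id may itself contain colons, so only the last field is the
    address; the kernel prints it in hexadecimal.
    """
    bus, addr_hex = name.rsplit(":", 1)
    return bus, int(addr_hex, 16)


def mmd_command(name, dev):
    """Build the argument list for `mdio <bus> mmd <port>:<dev>`."""
    bus, port = split_phy_name(name)
    return ["mdio", bus, "mmd", f"{port}:{dev}"]


# PHY name as logged for the Aquantia module, vendor MMD 30
print(" ".join(mmd_command("i2c:sfp2:11", 30)))
```

For the Marvell module the same rule gives `i2c:sfp2:16` as address 0x16, which matches the DEV column that `mdio i2c:sfp2` prints for it.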
> In other words, there is no C45 probing support on any bus, but this is certainly something we could (should!) add.
Support for c45 would be very useful for debugging SFP modules. Do you have any plans to add this functionality?
> If you know the address of the Aquantia PHY, you should be able to access it with mdio i2c:sfp2 mmd <port>:<dev> - even though the bus probing is not in place.
Reading all registers also doesn't work. However, reading individual registers works.
root@OpenWrt:~# mdio i2c:sfp2 mmd 17:30
ERROR: Unable to read status (-110)
root@OpenWrt:~# mdio i2c:sfp2 mmd 17:30 raw 0x2
0x31c3
root@OpenWrt:~# mdio i2c:sfp2 mmd 17:30 raw 0x3
0x1c13
> > In other words, there is no C45 probing support on any bus, but this is certainly something we could (should!) add.
> Support for c45 would be very useful for debugging SFP modules. Do you have any plans to add this functionality?
Hand-on-heart: probably not until I find myself needing it 😄
I'd be happy to accept a PR that adds it though.
> > If you know the address of the Aquantia PHY, you should be able to access it with mdio i2c:sfp2 mmd <port>:<dev> - even though the bus probing is not in place.
> Reading all registers also doesn't work. However, reading individual registers works.
> root@OpenWrt:~# mdio i2c:sfp2 mmd 17:30
> ERROR: Unable to read status (-110)
You're getting -ETIMEDOUT. An mdio-netlink program will run with a default timeout of 100ms, which should be more than enough time to read the 16 registers that your command should trigger.
Is this an unusually slow bus? Bit-banged I2C?
Do you get the same error with mdio i2c:sfp2 mmd 17:30 dump 0+15? What about mdio i2c:sfp2 mmd 17:30 dump 0+7, mdio i2c:sfp2 mmd 17:30 dump 0+3, etc.?
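For reference, the two status codes seen in this thread are negated Linux errno values. The sketch below hard-codes them from the generic Linux errno numbering, which is an assumption about the target (arm64 uses it, a few other architectures number these differently); `describe` is a hypothetical helper, not something mdio provides.

```python
# Values from Linux's generic errno numbering (assumed for the target).
LINUX_ERRNO = {
    95: ("EOPNOTSUPP", "Operation not supported"),
    110: ("ETIMEDOUT", "Connection timed out"),
}


def describe(status):
    """Translate a negative status printed by mdio into an errno name."""
    name, text = LINUX_ERRNO.get(-status, ("unknown", "not in this table"))
    return f"{status}: {name} ({text})"


print(describe(-95))   # the error from the bus scan
print(describe(-110))  # the error from the MMD status command
```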
> Hand-on-heart: probably not until I find myself needing it 😄
> I'd be happy to accept a PR that adds it though.
Then I'll try to write the missing pieces of code :D
> Is this an unusually slow bus? Bit-banged I2C?
The I2C controller is in hardware. My board is a Banana Pi R4 with an MT7988A SoC. There is also an I2C multiplexer between the SFP cage and the SoC.
This is probably an unrelated issue, but I also have an SFP+ module with RTL8261N/RTL8261BE that doesn't work. I noticed that it needs additional delays between commands. This issue is on the kernel side.
> Do you get the same error with mdio i2c:sfp2 mmd 17:30 dump 0+15, what about mdio i2c:sfp2 mmd 17:30 dump 0+7, mdio i2c:sfp2 mmd 17:30 dump 0+3 etc?
The register group reading looks OK:
root@OpenWrt:~# mdio i2c:sfp2 mmd 17:3 dump 0+15
0x0000: 0x2040
0x0001: 0x0002
0x0002: 0x31c3
0x0003: 0x1c13
0x0004: 0x00c1
0x0005: 0x009a
0x0006: 0xe000
0x0007: 0x0003
0x0008: 0xb009
0x0009: 0x0000
0x000a: 0x0000
0x000b: 0x0000
0x000c: 0x0000
0x000d: 0x0000
0x000e: 0x31c3
0x000f: 0x1c13
root@OpenWrt:~# mdio i2c:sfp2 mmd 17:3 dump 0+7
0x0000: 0x2040
0x0001: 0x0002
0x0002: 0x31c3
0x0003: 0x1c13
0x0004: 0x00c1
0x0005: 0x009a
0x0006: 0xe000
0x0007: 0x0003
root@OpenWrt:~# mdio i2c:sfp2 mmd 17:3 dump 0+3
0x0000: 0x2040
0x0001: 0x0002
0x0002: 0x31c3
0x0003: 0x1c13
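The dumps above can be decoded by hand. Assuming the standard Clause 45 layout (the one the Linux kernel uses when it probes C45 PHYs), registers 2 and 3 hold the two halves of the device identifier and registers 5 and 6 hold the low and high halves of the "devices in package" bitmap. A small Python sketch, with hypothetical helper names, applied to the first eight registers of the dump:

```python
DUMP = """\
0x0000: 0x2040
0x0001: 0x0002
0x0002: 0x31c3
0x0003: 0x1c13
0x0004: 0x00c1
0x0005: 0x009a
0x0006: 0xe000
0x0007: 0x0003
"""


def parse_dump(text):
    """Turn 'addr: value' lines printed by `mdio ... dump` into a dict."""
    regs = {}
    for line in text.splitlines():
        addr, value = line.split(":")
        regs[int(addr.strip(), 16)] = int(value.strip(), 16)
    return regs


def phy_id(regs):
    """Combine identifier registers 2 (high half) and 3 (low half)."""
    return (regs[2] << 16) | regs[3]


def mmds_in_package(regs):
    """List the MMD numbers whose bit is set in registers 5 (low) and 6 (high)."""
    bitmap = (regs[6] << 16) | regs[5]
    return [n for n in range(32) if (bitmap >> n) & 1]


regs = parse_dump(DUMP)
print(hex(phy_id(regs)))      # 0x31c31c13
print(mmds_in_package(regs))  # [1, 3, 4, 7, 29, 30, 31]
```

A C45 bus scan could be built on the same two reads per address, but that is only a sketch of the idea, not how mdio-tools implements (or will implement) probing.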
The PHY responds on several MMDs: 1, 3, 4, 7, 29, and 30:
root@OpenWrt:~# mdio i2c:sfp2 mmd 17:1 dump 0x2
0x0002: 0x31c3
root@OpenWrt:~# mdio i2c:sfp2 mmd 17:3 dump 0x2
0x0002: 0x31c3
root@OpenWrt:~# mdio i2c:sfp2 mmd 17:4 dump 0x2
0x0002: 0x31c3
root@OpenWrt:~# mdio i2c:sfp2 mmd 17:7 dump 0x2
0x0002: 0x31c3
root@OpenWrt:~# mdio i2c:sfp2 mmd 17:29 dump 0x2
0x0002: 0x31c3
root@OpenWrt:~# mdio i2c:sfp2 mmd 17:30 dump 0x2
0x0002: 0x31c3
But the status command times out on each of them:
root@OpenWrt:~# mdio i2c:sfp2 mmd 17:1
ERROR: Unable to read status (-110)
root@OpenWrt:~# mdio i2c:sfp2 mmd 17:3
ERROR: Unable to read status (-110)
root@OpenWrt:~# mdio i2c:sfp2 mmd 17:4
ERROR: Unable to read status (-110)
root@OpenWrt:~# mdio i2c:sfp2 mmd 17:7
ERROR: Unable to read status (-110)
root@OpenWrt:~# mdio i2c:sfp2 mmd 17:29
ERROR: Unable to read status (-110)
root@OpenWrt:~# mdio i2c:sfp2 mmd 17:30
ERROR: Unable to read status (-110)
EDIT: I added a bunch of debug printfs to the kernel, and they may be slowing down the reads.
Is /sys/class/mdio_bus/i2c:sfp2/statistics/errors_17 non-zero? Is that the source of the timeouts, or is it from the timeout in mdio-netlink?
> /sys/class/mdio_bus/i2c:sfp2/statistics/errors_17
With a kernel without the debug output:
root@OpenWrt:~# mdio i2c:sfp2 mmd 17:3
ERROR: Unable to read status (-110)
root@OpenWrt:~# cat /sys/class/mdio_bus/i2c:sfp2/statistics/errors_17
0
Access to registers in SFP modules is very slow. c22 over mdio:
root@OpenWrt:~# mdio mdio-bus 5 bench 0x2
Performed 1000 reads in 28ms
c45 over mdio:
root@OpenWrt:~# mdio mdio-bus 5:3 bench 0x2
Performed 1000 reads in 56ms
c22 over I2C (address 0x56):
root@OpenWrt:~# mdio i2c:sfp2 0x16 bench 0x2
Performed 1000 reads in 544ms
c45 over rollball over i2c:
root@OpenWrt:~# mdio i2c:sfp2 0x11:3 bench 0x2
Benchmark failed after 10.08s
ERROR: Bench operation failed (-110)
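To put the numbers above side by side, here is the per-read latency worked out in Python. The rollball figure is only a lower bound, and it rests on the assumption that the bench failed because the whole run of 1000 reads did not finish in time, not because a single read got stuck.

```python
READS = 1000

# Total time in milliseconds for 1000 reads, from the bench runs above
BENCH_MS = {
    "c22 over mdio": 28,
    "c45 over mdio": 56,
    "c22 over i2c": 544,
}


def per_read_us(total_ms, reads=READS):
    """Average time per read in microseconds."""
    return total_ms * 1000 / reads


for name, total_ms in BENCH_MS.items():
    print(f"{name}: {per_read_us(total_ms):.0f} us per read")

# The rollball run gave up after 10.08 s without completing 1000 reads,
# so under the assumption above the average read took longer than this.
print(f"c45 over rollball: more than {per_read_us(10080):.0f} us per read")
```

That makes a C22 read over I2C roughly 19 times slower than over the native MDIO bus, and a rollball read at least another order of magnitude slower than that.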
I think what's going on is that the MMD status command uses mdio_xfer()...
https://github.com/wkz/mdio-tools/blob/cd8a90801974afc64eabea664f15095b87dc289c/src/mdio/phy.c#L189
...which results in a 1s timeout...
https://github.com/wkz/mdio-tools/blob/cd8a90801974afc64eabea664f15095b87dc289c/src/mdio/mdio.c#L630-L634
...whereas bench and dump both use a 10s timeout:
https://github.com/wkz/mdio-tools/blob/cd8a90801974afc64eabea664f15095b87dc289c/src/mdio/mdio.c#L486
https://github.com/wkz/mdio-tools/blob/cd8a90801974afc64eabea664f15095b87dc289c/src/mdio/mdio.c#L536
The gist of it: your bus is slower than mdio expects any bus to be. I am hesitant to increase the timeout for the status command as well. I guess we could take a custom timeout as a flag 🤔