sonic-platform-daemons
Enhance/fix media_settings infra for 100G QSFP28 and DPB etc
Description
Enhance/fix the media_settings infra in the following aspects:
- Support/fix for 100G QSFP28 transceivers:
  - Fix the issue of `media_key` being parsed as `QSFP28-Unknown-...`, which happens because the compliance code of these modules is defined in the `Extended Specification Compliance` field rather than in `10/40G Ethernet Compliance Code`.
    Example transceiver info for 100G QSFP28:

        root@sonic:/home/cisco# show int trans info Ethernet176
        Ethernet176: SFP EEPROM detected
                Application Advertisement: N/A
                ...
                Identifier: QSFP28 or later
                ...
                Specification compliance:
                        10/40G Ethernet Compliance Code: Unknown
                        Extended Specification Compliance: 100GBASE-CR4, 25GBASE-CR CA-25G-L or 50GBASE-CR2 with RS
                ...
    Solution: Check `Extended Specification Compliance` for QSFP28 100G modules.
  - Fix the issue of `lane_speed_key` being parsed as `None`, which happens because QSFP28 has no `Application Advertisement` (a CMIS-specific field that contains the `host_electrical_interface_id` used by today's logic as the speed key).

    Solution: For non-CMIS modules, use `port_speed` and `lane_count` directly to calculate the lane speed and use it as the key, e.g. 100G / 4 = 25G, so the lane speed key is `speed:25G`.
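The two QSFP28 fixes above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names `pick_compliance_code` and `non_cmis_lane_speed_key` are hypothetical, the dictionary keys are taken from the CLI output shown earlier, and the port speed is assumed to be given in Mbps (e.g. 100000 for 100G).

```python
def pick_compliance_code(spec_compliance: dict) -> str:
    """Return the Ethernet compliance code, falling back to the extended field.

    Hypothetical helper: `spec_compliance` is assumed to be a dict keyed by
    the field names shown in the transceiver info output.
    """
    code = spec_compliance.get("10/40G Ethernet Compliance Code", "Unknown")
    if code and code != "Unknown":
        return code
    # 100G QSFP28 modules report their compliance in the extended field.
    return spec_compliance.get("Extended Specification Compliance", "Unknown")


def non_cmis_lane_speed_key(port_speed_mbps: int, lane_count: int) -> str:
    """Build a lane speed key such as 'speed:25G' from port speed and lane count.

    Assumption: port speed is expressed in Mbps.
    """
    if lane_count <= 0:
        raise ValueError("lane_count must be positive")
    lane_speed_gbps = port_speed_mbps // lane_count // 1000
    return f"speed:{lane_speed_gbps}G"
```

For a 100G port with four lanes, `non_cmis_lane_speed_key(100000, 4)` yields `speed:25G`, matching the example in the description.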
- Support/fix for DPB situations:
  - Fix the issue that serdes SI values for the wrong lanes get picked up from media_settings.json, caused by the two mistakes below in today's logic of get_media_val_str():
    - Mistake 1: It relies on num_logical_ports, obtained from len(port_mapping.physical_to_logical[port_idx]), to calculate num_lanes_per_logical_port. This is wrong because num_logical_ports doesn't always reflect the actual number of logical ports under each physical port, especially when logical ports are created dynamically at runtime and port_mapping.physical_to_logical[port_idx] gets expanded one logical port at a time via handle_port_change_event().
      Problem example:

          root@sonic:/home/cisco# config interface breakout Ethernet176 "4x25G" -fy
          root@sonic:/home/cisco# show int status | grep -E "Ethernet17[6-9]"
          Ethernet176  20  25G  9100  N/A  etp44a  routed  up  up  100GBASE-CR4  N/A
          Ethernet177  21  25G  9100  N/A  etp44b  routed  up  up  100GBASE-CR4  N/A
          Ethernet178  22  25G  9100  N/A  etp44c  routed  up  up  100GBASE-CR4  N/A
          Ethernet179  23  25G  9100  N/A  etp44d  routed  up  up  100GBASE-CR4  N/A

          >>-- port_mapping.handle_port_change_event() called for Ethernet176, and got inserted to port_mapping.physical_to_logical[44]
          Aug 18 04:46:51.967840 sonic NOTICE pmon#xcvrd[151847]: Publishing ASIC-side SI setting for port Ethernet176 (num_logical_ports=1, logical_idx=0) in APP_DB:
          Aug 18 04:46:51.967840 sonic NOTICE pmon#xcvrd[151847]: 0:(main,0x1a,0x1b,0x1c,0x1d) --> should be (main,0x1a) instead
          ......
          >>-- port_mapping.handle_port_change_event() called for Ethernet177, and got inserted to port_mapping.physical_to_logical[44]
          Aug 18 04:46:52.027793 sonic NOTICE pmon#xcvrd[151847]: Publishing ASIC-side SI setting for port Ethernet177 (num_logical_ports=2, logical_idx=1) in APP_DB:
          Aug 18 04:46:52.027818 sonic NOTICE pmon#xcvrd[151847]: 0:(main,0x1c,0x1d) --> should be (main,0x1b) instead
          ......
          >>-- port_mapping.handle_port_change_event() called for Ethernet178, and got inserted to port_mapping.physical_to_logical[44]
          Aug 18 04:46:52.085544 sonic NOTICE pmon#xcvrd[151847]: Publishing ASIC-side SI setting for port Ethernet178 (num_logical_ports=3, logical_idx=2) in APP_DB:
          Aug 18 04:46:52.085544 sonic NOTICE pmon#xcvrd[151847]: 0:(main,0x1c)
          ......
          >>-- port_mapping.handle_port_change_event() called for Ethernet179, and got inserted to port_mapping.physical_to_logical[44]
          Aug 18 04:46:52.142960 sonic NOTICE pmon#xcvrd[151847]: Publishing ASIC-side SI setting for port Ethernet179 (num_logical_ports=4, logical_idx=3) in APP_DB:
          Aug 18 04:46:52.142960 sonic NOTICE pmon#xcvrd[151847]: 0:(main,0x1d)
          ......
      Solution: Use the `lane_count` of each logical port, obtained directly from the 'lanes' field in the config DB port table.
    - Mistake 2: It calculates logical_port_idx from the order in which logical ports were inserted into port_mapping.physical_to_logical[port_idx], and uses this logical_port_idx to calculate the start_lane for picking up the serdes SI values of this logical port from media_settings.json. However, logical port config notifications can arrive in random order, which can differ from the actual index of the logical port and leads to a wrong start_lane calculation and wrong serdes SI values.
      Problem example: Upon xcvrd coming up (system bootup/process restart), Ethernet176 is the real 1st logical port but is the last one inserted into port_mapping.logical_port_list, and is thus wrongly treated as the 4th logical port:

          Aug 18 04:32:21.155536 sonic NOTICE pmon#xcvrd[151847]: Publishing ASIC-side SI setting for port Ethernet177 (num_logical_ports=4, logical_idx=0) in APP_DB:
          Aug 18 04:32:21.155536 sonic NOTICE pmon#xcvrd[151847]: 0:(main,0x1b)
          Aug 18 04:32:21.830787 sonic NOTICE pmon#xcvrd[151847]: Publishing ASIC-side SI setting for port Ethernet178 (num_logical_ports=4, logical_idx=1) in APP_DB:
          Aug 18 04:32:21.830855 sonic NOTICE pmon#xcvrd[151847]: 0:(main,0x1c)
          Aug 18 04:32:21.895498 sonic NOTICE pmon#xcvrd[151847]: Publishing ASIC-side SI setting for port Ethernet179 (num_logical_ports=4, logical_idx=2) in APP_DB:
          Aug 18 04:32:21.895537 sonic NOTICE pmon#xcvrd[151847]: 0:(main,0x1d)
          Aug 18 04:32:22.166040 sonic NOTICE pmon#xcvrd[151847]: Publishing ASIC-side SI setting for port Ethernet176 (num_logical_ports=4, logical_idx=3) in APP_DB:
          Aug 18 04:32:22.166121 sonic NOTICE pmon#xcvrd[151847]: 0:(main,0x1a)
      Solution: Use the `subport` number, obtained directly from the config DB port table, as the index of the logical port (nowadays subport is always populated automatically).
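The lane selection described by the two DPB solutions can be sketched as below. This is an illustrative sketch rather than the PR's implementation: `select_lanes` is a hypothetical name, `lanes_field` stands for the comma-separated 'lanes' string of one logical port, and `subport` is assumed to be 0 for a port without breakout and 1..N for breakout ports.

```python
def select_lanes(si_values: list, lanes_field: str, subport: int) -> list:
    """Pick the per-lane SI values that belong to one logical port.

    si_values   : SI values for all lanes of the physical port, in lane order
    lanes_field : 'lanes' value of this logical port, e.g. "20" or "20,21,22,23"
    subport     : subport number of this logical port (assumed 0 = no breakout)
    """
    # Lane count comes from the port's own config, not from how many
    # logical ports happen to be known at this moment.
    lane_count = len(lanes_field.split(","))
    # Index comes from subport, not from notification arrival order.
    index = max(subport - 1, 0)
    start_lane = index * lane_count
    return si_values[start_lane:start_lane + lane_count]
```

With the four `main` values from the logs above, `select_lanes(["0x1a", "0x1b", "0x1c", "0x1d"], "21", 2)` selects `["0x1b"]` for Ethernet177, regardless of the order in which the logical ports were announced.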
- Add regular expression support for `lane_speed_key`, so that multiple lane speed keys can be grouped together if they share the same lane speed value or the same serdes SI values, e.g. `speed:200GAUI-8|100GAUI-4|50GAUI-2|25G`
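One way such a grouped key could be matched is shown below. This is a sketch under assumptions, not necessarily how the parser does it: `key_matches` is a hypothetical helper, and the part after `speed:` in the media_settings.json key is assumed to be treated as a regular expression that must match the whole lane speed.

```python
import re


def key_matches(settings_key: str, lane_speed_key: str) -> bool:
    """Check whether a (possibly grouped) settings key covers a lane speed key."""
    prefix = "speed:"
    if not (settings_key.startswith(prefix) and lane_speed_key.startswith(prefix)):
        return False
    pattern = settings_key[len(prefix):]
    wanted = lane_speed_key[len(prefix):]
    # fullmatch avoids partial hits, e.g. "25G" matching inside "125G".
    return re.fullmatch(pattern, wanted) is not None
```

Using `re.fullmatch` rather than `re.search` is a deliberate choice here, so that each alternative in the group has to match the entire lane speed string.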
- Add `lane_speed_key` support under the `Default` vendor/media key. Also add support for `speed:Default`, which is useful if a default serdes SI setting value is desired when no match is found among the available lane speed keys.

  Example:

      {
          'GLOBAL_MEDIA_SETTINGS': {
              '0-31': {
                  'Default': {
                      'speed:400GAUI-8': {'idriver': {'lane0': '0x1a', ...}, ...},
                      'speed:200GAUI-8|100GAUI-4|50GAUI-2|25G': {'idriver': {'lane0': '0x1b', ...}, ...},
                      'speed:Default': {'idriver': {'lane0': '0x1c', ...}, ...},
                  }
              },
          }
      }
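The fallback behaviour of `speed:Default` can be illustrated with the following lookup sketch. It is a simplified stand-in for the real parser: `find_lane_speed_settings` is a hypothetical function, and `media_dict` is assumed to be the dictionary found under a vendor/media key such as `Default`.

```python
import re


def find_lane_speed_settings(media_dict: dict, lane_speed_key):
    """Return the settings for a lane speed key, or the 'speed:Default' entry."""
    prefix = "speed:"
    wanted = None
    if lane_speed_key and lane_speed_key.startswith(prefix):
        wanted = lane_speed_key[len(prefix):]

    for key, value in media_dict.items():
        # 'speed:Default' is only used as the last resort below.
        if key == "speed:Default" or not key.startswith(prefix):
            continue
        if wanted is not None and re.fullmatch(key[len(prefix):], wanted):
            return value

    # No explicit match: fall back to the default entry (None if absent).
    return media_dict.get("speed:Default")
```

With the example settings above, a lane speed key of `speed:25G` would resolve to the grouped entry, while an unlisted key would resolve to the `speed:Default` entry.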
- Improved code coverage of media_settings_parser.py to 97%
Motivation and Context
This PR mainly ensures that the media_settings infra works properly for 100G QSFP28 and DPB cases, among others.
How Has This Been Tested?
- Verified that proper settings got notified with different transceivers under both DPB and non-DPB cases
- Verified compatibility with existing media_settings.json