sonic-buildimage
[Chassis Cisco 8800] vlanmgrd fails to start because a PortChannel is not up after config reload
Description
Steps to reproduce the issue:
The issue is flaky; running config reload repeatedly can reproduce it:
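Because the failure is intermittent, a small loop can help catch it. The following is a minimal sketch, not the procedure used for this report. It assumes that `config reload -y` is the non-interactive form of the command, that the current syslog is readable at `/var/log/syslog`, and that five minutes is enough for services to settle; the helper names (`run_reload`, `read_syslog`, `reproduce`) are hypothetical. It does not handle log rotation between checks.

```python
import subprocess
import time
from typing import Callable, Optional

# Text taken from the vlanmgrd error quoted in this report.
ERROR_MARKER = "Cannot find device"


def run_reload() -> None:
    # Assumption: `-y` skips the confirmation prompt of `config reload`.
    subprocess.run(["sudo", "config", "reload", "-y"], check=True)


def read_syslog(path: str = "/var/log/syslog") -> str:
    # Assumption: the current syslog lives at this path and is readable.
    with open(path, errors="replace") as f:
        return f.read()


def reproduce(
    max_attempts: int,
    reload: Callable[[], None] = run_reload,
    read_log: Callable[[], str] = read_syslog,
    settle_seconds: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Reload repeatedly; return the attempt number on which the error
    marker first appears in the log, or None if it never appears."""
    for attempt in range(1, max_attempts + 1):
        before = read_log().count(ERROR_MARKER)
        reload()
        sleep(settle_seconds)  # give swss/teamd time to come back up
        if read_log().count(ERROR_MARKER) > before:
            return attempt
    return None
```

Calling `reproduce(20)` on the device would stop at the first reload after which a new "Cannot find device" line shows up. The `reload`, `read_log`, and `sleep` parameters are injectable so the loop can be exercised without touching a real system.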
Describe the results you received:
After config reload, a PortChannel stays down and vlanmgrd fails to start, logging the error: "vlanmgrd Cannot find device PortChannel**"
Describe the results you expected:
After config reload, vlanmgrd should always come up.
Output of show version:
SONiC Software Version: SONiC.internal-202405.106004063-6a8cbc2250
SONiC OS Version: 12
Distribution: Debian 12.6
Kernel: 6.1.0-22-2-amd64
Build commit: 6a8cbc2250
Build date: Thu Oct 17 19:42:45 UTC 2024
Built by: azureuser@19bdbc57c000000

Platform: x86_64-88_lc0_36fh_mo-r0
HwSKU: Cisco-88-LC0-36FH-M-O36
ASIC: cisco-8000
ASIC Count: 3
Serial Number: FOC2533NMVZ
Model Number: 88-LC0-36FH-MO
Hardware Revision: 1.0
Uptime: 14:27:21 up 4:02, 2 users, load average: 2.53, 5.36, 6.78
Date: Thu 24 Oct 2024 14:27:21

Docker images:
REPOSITORY TAG IMAGE ID SIZE
docker-platform-monitor internal-202405.106004063-6a8cbc2250 d77fc8674af4 443MB
docker-platform-monitor latest d77fc8674af4 443MB
docker-sonic-telemetry internal-202405.106004063-6a8cbc2250 3e1d8e8ba432 382MB
docker-sonic-telemetry latest 3e1d8e8ba432 382MB
docker-orchagent internal-202405.106004063-6a8cbc2250 a3f487f5ed08 357MB
docker-orchagent latest a3f487f5ed08 357MB
docker-fpm-frr internal-202405.106004063-6a8cbc2250 298ec570250e 369MB
docker-fpm-frr latest 298ec570250e 369MB
docker-macsec latest 247fc78f2696 347MB
docker-dhcp-relay latest 9235b66d9447 327MB
docker-snmp internal-202405.106004063-6a8cbc2250 1231ba6ee5eb 355MB
docker-snmp latest 1231ba6ee5eb 355MB
docker-teamd internal-202405.106004063-6a8cbc2250 02ac5694af56 344MB
docker-teamd latest 02ac5694af56 344MB
docker-router-advertiser internal-202405.106004063-6a8cbc2250 96d13d355508 316MB
docker-router-advertiser latest 96d13d355508 316MB
docker-sonic-restapi internal-202405.106004063-6a8cbc2250 8136b98201db 334MB
docker-sonic-restapi latest 8136b98201db 334MB
docker-mux internal-202405.106004063-6a8cbc2250 cd8914270d22 368MB
docker-mux latest cd8914270d22 368MB
docker-lldp internal-202405.106004063-6a8cbc2250 2903bb97d61e 361MB
docker-lldp latest 2903bb97d61e 361MB
docker-sonic-gnmi internal-202405.106004063-6a8cbc2250 13ea4346ddf7 381MB
docker-sonic-gnmi latest 13ea4346ddf7 381MB
docker-eventd internal-202405.106004063-6a8cbc2250 0bfcddab4465 316MB
docker-eventd latest 0bfcddab4465 316MB
docker-database internal-202405.106004063-6a8cbc2250 02fc9b9bd2e7 324MB
docker-database latest 02fc9b9bd2e7 324MB
docker-vnet-monitor internal-202405.106004063-6a8cbc2250 353a0a25ab21 326MB
docker-vnet-monitor latest 353a0a25ab21 326MB
docker-ipxeserver-cisco internal-202405.106004063-6a8cbc2250 d3cb05aecfe3 345MB
docker-ipxeserver-cisco latest d3cb05aecfe3 345MB
docker-syncd-cisco internal-202405.106004063-6a8cbc2250 8f977ee7389b 1.08GB
docker-syncd-cisco latest 8f977ee7389b 1.08GB
docker-gbsyncd-cisco internal-202405.106004063-6a8cbc2250 8cb76b4fb3b9 371MB
docker-gbsyncd-cisco latest 8cb76b4fb3b9 371MB
docker-acms internal-202405.106004063-6a8cbc2250 adb717aa470b 350MB
docker-acms latest adb717aa470b 350MB
Additional information you deem important (e.g. issue happens only occasionally):
- Config reload was run on the RP at 17:51:17:
2024 Oct 23 17:50:42.538549 str3-8800-sup-1 INFO python[469440]: ansible-ansible.legacy.command Invoked with executable=/bin/bash _raw_params=config reload -h _uses_shell=True warn=False stdin_add_newline=True strip_empty_ends=True argv=None chdir=None creates=None removes=None stdin=None
2024 Oct 23 17:51:17.352509 str3-8800-sup-1 INFO python[470756]: ansible-ansible.legacy.command Invoked with executable=/bin/bash raw
- sudo grep vlanmgrd tmp/syslog.4
2024 Oct 23 17:53:36.261126 str3-8800-sup-1 NOTICE swss0#vlanmgrd: :- main: starting main loop
2024 Oct 23 17:53:36.465270 str3-8800-sup-1 NOTICE swss11#vlanmgrd: :- main: starting main loop
2024 Oct 23 17:53:36.493126 str3-8800-sup-1 NOTICE swss12#vlanmgrd: :- main: starting main loop
2024 Oct 23 17:53:37.925112 str3-8800-sup-1 INFO swss5#supervisord: vlanmgrd Cannot find device "PortChannel82"
2024 Oct 23 17:53:37.927362 str3-8800-sup-1 ERR swss5#vlanmgrd: :- main: Runtime error: /bin/bash -c "/sbin/ip link set "PortChannel82" master Bridge && /sbin/bridge vlan del vid 1 dev "PortChannel82" && /sbin/bridge vlan add vid 2 dev "PortChannel82" pvid untagged" :
2024 Oct 23 17:53:37.953116 str3-8800-sup-1 INFO swss5#supervisord 2024-10-23 17:53:37,948 WARN exited: vlanmgrd (exit status 255; not expected)
- admin@str3-8800-sup-1:~$ sudo grep PortChannel82 tmp/syslog.4
2024 Oct 23 17:51:20.955337 str3-8800-sup-1 NOTICE swss5#portsyncd: :- onMsg: Cannot find PortChannel82 in port table
2024 Oct 23 17:51:20.955337 str3-8800-sup-1 NOTICE swss5#portsyncd: :- onMsg: nlmsg type:16 key:PortChannel82 admin:1 oper:0 addr:aa:00:04:00:00:07 ifindex:60 master:62 type:team
2024 Oct 23 17:51:21.237668 str3-8800-sup-1 NOTICE swss5#orchagent: :- updatePortOperStatus: Port PortChannel82 oper state set from up to down
2024 Oct 23 17:51:27.926494 str3-8800-sup-1 NOTICE teamd5#tlm_teamd: :- ~TeamdCtlMgr: Exiting. Disconnecting from teamd. LAG 'PortChannel82'
2024 Oct 23 17:51:29.241707 str3-8800-sup-1 NOTICE teamd5#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel82 pid 38
2024 Oct 23 17:51:29.302234 str3-8800-sup-1 ERR swss5#orchagent: :- removeLag: Failed to remove non-empty LAG PortChannel82
2024 Oct 23 17:51:29.304136 str3-8800-sup-1 ERR swss5#orchagent: :- removeLag: Failed to remove non-empty LAG PortChannel82
2024 Oct 23 17:51:29.305480 str3-8800-sup-1 INFO kernel: [ 5400.283343] PortChannel82: Port device Ethernet-BP2484 removed
2024 Oct 23 17:51:29.307469 str3-8800-sup-1 NOTICE swss5#portsyncd: :- onMsg: nlmsg type:16 key:PortChannel82 admin:1 oper:0 addr:aa:00:04:00:00:07 ifindex:60 master:62 type:team
2024 Oct 23 17:51:29.308508 str3-8800-sup-1 ERR swss5#orchagent: :- removeLag: Failed to remove non-empty LAG PortChannel82
2024 Oct 23 17:51:29.308993 str3-8800-sup-1 ERR swss5#orchagent: :- removeLag: Failed to remove non-empty LAG PortChannel82
2024 Oct 23 17:51:29.309615 str3-8800-sup-1 ERR swss5#orchagent: :- removeLag: Failed to remove non-empty LAG PortChannel82
2024 Oct 23 17:51:29.310005 str3-8800-sup-1 ERR swss5#orchagent: :- removeLag: Failed to remove non-empty LAG PortChannel82
2024 Oct 23 17:51:29.342253 str3-8800-sup-1 ERR swss5#orchagent: :- removeLag: Failed to remove non-empty LAG PortChannel82
2024 Oct 23 17:51:29.350327 str3-8800-sup-1 ERR swss5#orchagent: :- removeLag: Failed to remove non-empty LAG PortChannel82
2024 Oct 23 17:51:29.361296 str3-8800-sup-1 INFO kernel: [ 5400.341023] PortChannel82: Port device Ethernet-BP2480 removed
2024 Oct 23 17:51:29.406896 str3-8800-sup-1 NOTICE swss5#portsyncd: :- onMsg: nlmsg type:16 key:PortChannel82 admin:1 oper:0 addr:aa:00:04:00:00:07 ifindex:60 master:62 type:team
2024 Oct 23 17:51:29.529549 str3-8800-sup-1 NOTICE swss5#orchagent: :- removeLagMember: Remove member Ethernet-BP2480 from LAG PortChannel82 lid:2000000000766 lmid:1b000000000772
2024 Oct 23 17:51:29.613760 str3-8800-sup-1 NOTICE swss5#portsyncd: :- onMsg: nlmsg type:16 key:PortChannel82 admin:0 oper:0 addr:aa:00:04:00:00:07 ifindex:60 master:62 type:team
...
2024 Oct 23 17:53:44.404242 str3-8800-sup-1 NOTICE teamd5#teammgrd: :- addLag: Start port channel PortChannel82 with teamd
2024 Oct 23 17:53:44.415867 str3-8800-sup-1 NOTICE swss5#portsyncd: :- onMsg: nlmsg type:16 key:PortChannel82 admin:1 oper:0 addr:aa:00:04:00:00:07 ifindex:87 master:0 type:team
2024 Oct 23 17:53:44.418001 str3-8800-sup-1 INFO kernel: [ 5535.395081] 8021q: adding VLAN 0 to HW filter on device PortChannel82
2024 Oct 23 17:53:44.429133 str3-8800-sup-1 NOTICE teamd5#teammgrd: :- setLagAdminStatus: Set port channel PortChannel82 admin status to up
2024 Oct 23 17:53:44.445066 str3-8800-sup-1 NOTICE teamd5#teammgrd: :- setLagMtu: Set port channel PortChannel82 MTU to 9100
2024 Oct 23 17:53:44.445066 str3-8800-sup-1 NOTICE teamd5#teammgrd: :- setLagTpid: Set port channel PortChannel82 TPID to 0x8100
2024 Oct 23 17:53:44.445066 str3-8800-sup-1 NOTICE teamd5#teammgrd: :- doLagTask: Configure PortChannel82 TPID to 0x8100
2024 Oct 23 17:53:44.485866 str3-8800-sup-1 NOTICE teamd5#tlm_teamd: :- try_add_lag: The LAG 'PortChannel82' has been added.
2024 Oct 23 17:54:13.701721 str3-8800-sup-1 NOTICE swss5#portsyncd: :- onMsg: nlmsg type:16 key:PortChannel82 admin:1 oper:0 addr:aa:00:04:00:00:07 ifindex:87 master:0 type:team
2024 Oct 23 17:54:46.765071 str3-8800-sup-1 INFO kernel: [ 5597.747083] PortChannel82: Port device Ethernet-BP2480 added
2024 Oct 23 17:54:46.766076 str3-8800-sup-1 NOTICE swss5#portsyncd: :- onMsg: nlmsg type:16 key:PortChannel82 admin:1 oper:0 addr:aa:00:04:00:00:07 ifindex:87 master:0 type:team
2024 Oct 23 17:54:46.843059 str3-8800-sup-1 NOTICE teamd5#teammgrd: :- addLagMember: Add Ethernet-BP2480 to port channel PortChannel82
2024 Oct 23 17:54:47.929185 str3-8800-sup-1 INFO kernel: [ 5598.911424] PortChannel82: Port device Ethernet-BP2484 added
2024 Oct 23 17:54:47.929386 str3-8800-sup-1 NOTICE swss5#portsyncd: :- onMsg: nlmsg type:16 key:PortChannel82 admin:1 oper:0 addr:aa:00:04:00:00:07 ifindex:87 master:0 type:team
2024 Oct 23 17:54:47.937455 str3-8800-sup-1 NOTICE teamd5#teammgrd: :- addLagMember: Add Ethernet-BP2484 to port channel PortChannel82
2024 Oct 23 17:54:57.303052 str3-8800-sup-1 NOTICE swss5#orchagent: :- addLag: Create an empty LAG PortChannel82 lid:2000000000766
2024 Oct 23 17:54:57.303360 str3-8800-sup-1 NOTICE swss5#orchagent: :- updatePortOperStatus: Port PortChannel82 oper state set from unknown to down
2024 Oct 23 17:54:57.433640 str3-8800-sup-1 NOTICE swss5#orchagent: :- addLagMember: Add member Ethernet-BP2480 to LAG PortChannel82 lid:2000000000766 pid:1000000000024
2024 Oct 23 17:54:57.446576 str3-8800-sup-1 NOTICE swss5#orchagent: :- addLagMember: Add member Ethernet-BP2484 to LAG PortChannel82 lid:2000000000766 pid:1000000000025
2024 Oct 23 17:55:45.390718 str3-8800-sup-1 NOTICE swss5#portsyncd: :- onMsg: nlmsg type:16 key:PortChannel82 admin:1 oper:1 addr:aa:00:04:00:00:07 ifindex:87 master:0 type:team
2024 Oct 23 17:55:45.393162 str3-8800-sup-1 INFO kernel: [ 5656.374093] IPv6: ADDRCONF(NETDEV_CHANGE): PortChannel82: link becomes ready
2024 Oct 23 17:55:45.523011 str3-8800-sup-1 NOTICE swss5#orchagent: :- updatePortOperStatus: Port PortChannel82 oper state set from down to up
admin@str3-8800-sup-1:~$
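Comparing timestamps across the two grep outputs: vlanmgrd in swss5 fails with "Cannot find device" at 17:53:37.925112, while teammgrd only logs "Start port channel PortChannel82 with teamd" at 17:53:44.404242, and the ifindex reported by portsyncd changes from 60 to 87 in between. This suggests vlanmgrd tried to use PortChannel82 before the netdev had been recreated, though that is a reading of the logs rather than a confirmed root cause. The sketch below computes the gap from the two log lines copied from above; `parse_ts` is a hypothetical helper, and the timestamp format is inferred from these excerpts.

```python
from datetime import datetime


def parse_ts(line: str) -> datetime:
    # Lines in this report start with "YYYY Mon DD HH:MM:SS.ffffff".
    stamp = " ".join(line.split()[:4])
    return datetime.strptime(stamp, "%Y %b %d %H:%M:%S.%f")


# Both lines are copied verbatim from the log excerpts in this report.
VLANMGRD_FAIL = (
    "2024 Oct 23 17:53:37.925112 str3-8800-sup-1 INFO swss5#supervisord: "
    'vlanmgrd Cannot find device "PortChannel82"'
)
TEAMMGRD_ADD = (
    "2024 Oct 23 17:53:44.404242 str3-8800-sup-1 NOTICE teamd5#teammgrd: "
    ":- addLag: Start port channel PortChannel82 with teamd"
)

# Positive gap means vlanmgrd failed before teammgrd started the port channel.
gap = (parse_ts(TEAMMGRD_ADD) - parse_ts(VLANMGRD_FAIL)).total_seconds()
print(f"vlanmgrd failed {gap:.3f} s before teammgrd started PortChannel82")
```

With these two lines the gap is about 6.5 seconds, so vlanmgrd had already exited by the time PortChannel82 existed again.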