sonic-swss
sonic-swss copied to clipboard
orchagent/portsorch: Missing scheduler group after SWSS restart
What I did Added function to query scheduler group objects during SWSS restart.
Why I did it Scheduler group objects were removed as they were missing in temp view during SWSS restart. The triggers to reproduce this issue:
sudo config warm_restart enable swss
sudo service swss restart
How I verified it Verified scheduler group objects are not removed after the fix and are part of ASIC_DB.
Details if related Mar 1 22:46:39.015724 sonic NOTICE swss#orchagent: :- setWarmStartState: orchagent warm start state changed to restored Mar 1 22:46:39.015724 sonic NOTICE swss#orchagent: :- warmRestoreAndSyncUp: Orchagent state restore done Mar 1 22:46:39.015724 sonic NOTICE swss#orchagent: :- syncd_apply_view: Notify syncd APPLY_VIEW Mar 1 22:46:39.015724 sonic NOTICE swss#orchagent: :- notifySyncd: sending syncd: APPLY_VIEW Mar 1 22:46:39.016078 sonic WARNING syncd[24]: :- processNotifySyncd: syncd received APPLY VIEW, will translate Mar 1 22:46:39.155799 sonic NOTICE syncd[24]: :- dump: getting took 0.139421 sec Mar 1 22:46:39.172973 sonic NOTICE syncd[24]: :- getAsicView: ASIC_STATE switch count: 1: Mar 1 22:46:39.172973 sonic NOTICE syncd[24]: :- getAsicView: oid:0x21000000000000: objects count: 7529 Mar 1 22:46:39.175515 sonic NOTICE syncd[24]: :- getAsicView: get asic view from ASIC_STATE took 0.159394 sec Mar 1 22:46:39.304183 sonic NOTICE syncd[24]: :- dump: getting took 0.128388 sec Mar 1 22:46:39.318381 sonic NOTICE syncd[24]: :- getAsicView: TEMP_ASIC_STATE switch count: 1: Mar 1 22:46:39.318381 sonic NOTICE syncd[24]: :- getAsicView: oid:0x21000000000000: objects count: 7174 Mar 1 22:46:39.320676 sonic NOTICE syncd[24]: :- getAsicView: get asic view from TEMP_ASIC_STATE took 0.145098 sec Mar 1 22:46:39.412827 sonic NOTICE syncd[24]: :- ComparisonLogic: srand seed for switch oid:0x21000000000000: 1646174799 Mar 1 22:46:39.413412 sonic NOTICE syncd[24]: :- matchOids: matched oids Mar 1 22:46:39.413412 sonic NOTICE syncd[24]: :- populateExistingObjects: populate existing objects Mar 1 22:46:39.414330 sonic NOTICE syncd[24]: :- checkInternalObjects: check internal objects Mar 1 22:46:39.414519 sonic WARNING syncd[24]: :- checkInternalObjects: different number of objects SAI_OBJECT_TYPE_SCHEDULER_GROUP, curr: 321, tmp 33 (not expected if warm boot) Mar 1 22:46:39.414519 sonic ERR syncd[24]: :- checkInternalObjects: object status is not MATCHED on curr: SAI_OBJECT_TYPE_SCHEDULER_GROUP:oid:0x1700000000005e Mar 1 22:46:39.414519 sonic ERR syncd[24]: :- checkInternalObjects: object status is not MATCHED on curr: SAI_OBJECT_TYPE_SCHEDULER_GROUP:oid:0x1700000000005f Mar 1 22:46:39.414519 sonic ERR syncd[24]: :- checkInternalObjects: object status is not MATCHED on curr: SAI_OBJECT_TYPE_SCHEDULER_GROUP:oid:0x17000000000060 .. .. Mar 1 22:46:39.421248 sonic ERR syncd[24]: :- checkInternalObjects: object status is not MATCHED on curr: SAI_OBJECT_TYPE_SCHEDULER_GROUP:oid:0x1700000000034e Mar 1 22:46:39.439937 sonic NOTICE syncd[24]: :- createPreMatchMap: preMatch map size: 102, tmp oid obj: 186 Mar 1 22:46:39.439937 sonic NOTICE syncd[24]: :- createPreMatchMap: create preMatch map took 0.023570 sec Mar 1 22:46:39.441150 sonic WARNING syncd[24]: :- logViewObjectCount: object count for SAI_OBJECT_TYPE_SCHEDULER_GROUP on current view 321 is different than on temporary view: 33 Mar 1 22:46:39.443028 sonic WARNING syncd[24]: :- logViewObjectCount: object count is different on both view, there will be ASIC OPERATIONS! Mar 1 22:46:39.443028 sonic NOTICE syncd[24]: :- checkMatchedPorts: all ports are matched Mar 1 22:46:39.443082 sonic WARNING syncd[24]: :- performObjectSetTransition: current attr is CREATE_ONLY and object is MATCHED: oid:0x1000000000050 transferring SAI_PORT_ATTR_HW_LANE_LIST:4:0,1,2,3 to temp object Mar 1 22:46:39.443150 sonic WARNING syncd[24]: :- performObjectSetTransition: current attr is CREATE_ONLY and object is MATCHED: oid:0x1000000000068 transferring SAI_PORT_ATTR_HW_LANE_LIST:4:4,5,6,7 to temp object .. .. Mar 1 22:46:39.444210 sonic WARNING syncd[24]: :- performObjectSetTransition: current attr is CREATE_ONLY and object is MATCHED: oid:0x1000000000338 transferring SAI_PORT_ATTR_HW_LANE_LIST:4:124,125,126,127 to temp object Mar 1 22:46:39.516195 sonic NOTICE syncd[24]: :- applyViewTransition: loop removed 288 objects Mar 1 22:46:39.516680 sonic NOTICE syncd[24]: :- applyViewTransition: comparison logic took 0.073610 sec Mar 1 22:46:39.516680 sonic NOTICE syncd[24]: :- transferNotProcessed: calling transferNotProcessed Mar 1 22:46:39.517046 sonic NOTICE syncd[24]: :- compareViews: ASIC operations to execute: 288 Mar 1 22:46:39.517992 sonic NOTICE syncd[24]: :- compareViews: all temporary view objects were processed to FINAL state Mar 1 22:46:39.518395 sonic NOTICE syncd[24]: :- compareViews: all current view objects were processed to FINAL state Mar 1 22:46:39.518395 sonic NOTICE syncd[24]: :- executeOperationsOnAsic: operations to execute on ASIC: 288 Mar 1 22:46:39.518416 sonic NOTICE syncd[24]: :- executeOperationsOnAsic: NOT optimized operations Mar 1 22:46:39.518451 sonic NOTICE syncd[24]: :- executeOperationsOnAsic: remove: SAI_OBJECT_TYPE_SCHEDULER_GROUP:oid:0x1700000000005e Mar 1 22:46:39.518451 sonic NOTICE syncd[24]: :- executeOperationsOnAsic: remove: SAI_OBJECT_TYPE_SCHEDULER_GROUP:oid:0x1700000000005f .. .. Mar 1 22:46:39.527085 sonic NOTICE syncd[24]: :- executeOperationsOnAsic: operations on SAI_OBJECT_TYPE_SCHEDULER_GROUP: 288 Mar 1 22:46:39.527085 sonic NOTICE syncd[24]: :- asicGetWithOptimizedRemoveOperations: moved 288 REMOVE operations upper in stack from total 288 operations Mar 1 22:46:39.527118 sonic NOTICE syncd[24]: :- asicGetWithOptimizedRemoveOperations: optimizing asic remove operations took 0.000160 sec Mar 1 22:46:39.530969 sonic NOTICE syncd[24]: :- executeOperationsOnAsic: asic apply took 0.012517 sec Mar 1 22:46:39.530969 sonic NOTICE syncd[24]: :- executeOperationsOnAsic: performed all operations on asic successfully Mar 1 22:46:39.589620 sonic NOTICE syncd[24]: :- threadFunction: time span 573 ms for 'notify:APPLY_VIEW'
why is this? "Scheduler group objects were removed as they were missing in temp view during SWSS restart."
why is this? "Scheduler group objects were removed as they were missing in temp view during SWSS restart."
As we spoke on sync up, this is related to 2nd restart of OA while syncd is still running, this is more like a syncd/comparison logic design issue.
This issue seems the same case of https://github.com/Azure/sonic-sairedis/pull/994
This issue seems the same case of Azure/sonic-sairedis#994
yea, seems like this will also solve the problem
any progress here? could this be moved out of draft and merged?
@lguohan can this PR be merged? Are there any additional items that @arvbb needs to take care of
Hi @neethajohn, can this PR be merged? Thanks
@neethajohn , could you please review/signoff?