sonic-swss
sonic-swss copied to clipboard
[CoPP] Fix the bug that copp configuration will cause failed logs during warm restart
What I did Let CoppMgr skip configuration which are already in APPL_DB after warm restart.
Why I did it After warm start, the CoppMgr will set configuration into APPL_DB which already exist, but this behavior is not necessary and it will cause CoPPorch to do a modification event.
How I verified it With this enhancement, the syslog will not appear the error messages about Copp during warm restart.
Details if related
Can you please paste the error messages seen after warmboot here?
Dec 9 12:38:52.857966 as7726-32x-4 ERR swss#orchagent: :- meta_generic_validation_set: SAI_POLICER_ATTR_METER_TYPE:SAI_ATTR_VALUE_TYPE_INT32 attr is create only and cannot be modified
Dec 9 12:38:52.857966 as7726-32x-4 ERR swss#orchagent: :- trapGroupUpdatePolicer: Failed to apply attribute[2].id=0 to policer for trap group:default, error:-5
Dec 9 12:38:52.857966 as7726-32x-4 ERR swss#orchagent: :- doTask: Processing copp task item failed, exiting.
@dgsudharsan , could you please check the latest comment?
According to existing logic we delete the APP_DB will be cleared during an upgrade which includes warm-upgrade scenario. Without the APP_DB being set the feature wouldn't work during warm upgrades. Please revisit the logic to make sure it works for both warm restart and warm upgrade.
Please check here
https://github.com/Azure/sonic-utilities/blob/d9f3afe5b34ef0f03f4137f607bc73bc625631ed/scripts/db_migrator.py#L166
Dec 9 12:38:52.857966 as7726-32x-4 ERR swss#orchagent: :- doTask: Processing copp task item failed, exiting.
@chaoskao Can you please explain the entire scenario. I tried with default copp configurations and I don't see any issues. Are you adding additional Copp configurations?
@dgsudharsan I following the test steps in https://github.com/Azure/sonic-mgmt/blob/master/tests/vrf/test_vrf.py#L1020, enable the swss warm reboot and restart the swss container by CLI command.
The CoPP configuration shows as following
{ "COPP_GROUP": { "default": { "queue": "0", "meter_type":"packets", "mode":"sr_tcm", "cir":"600", "cbs":"600", "red_action":"drop" } } }
@dgsudharsan I following the test steps in https://github.com/Azure/sonic-mgmt/blob/master/tests/vrf/test_vrf.py#L1020, enable the swss warm reboot and restart the swss container by CLI command.
The CoPP configuration shows as following
{ "COPP_GROUP": { "default": { "queue": "0", "meter_type":"packets", "mode":"sr_tcm", "cir":"600", "cbs":"600", "red_action":"drop" } } }
I believe the usecase is swss warm-restart. Can you please check if warm-reboot itself is impacted? If swss warm-restart we may have to deal it differently.
@dgsudharsan warm-reboot will not show the error message, the error log only shows in syslog when service swss restart
@dgsudharsan warm-reboot will not show the error message, the error log only shows in syslog when service swss restart
With your changes warmboot and warmupgrade will be broken. A suggestion would be to check if app_db is populated already and ignore setting app db. Another suggestion is to remove app_db entries as in warm reboot @prsunny Can you please suggest your opinion?
@dgsudharsan warm-reboot will not show the error message, the error log only shows in syslog when service swss restart
With your changes warmboot and warmupgrade will be broken. A suggestion would be to check if app_db is populated already and ignore setting app db. Another suggestion is to remove app_db entries as in warm reboot @prsunny Can you please suggest your opinion?
Do you means check the each attribute of CoPP rules from App_DB, if attribute already exist just ignore this attribute or just check the key of rules?
Do you means check the each attribute of CoPP rules from App_DB, if attribute already exist just ignore this attribute or just check the key of rules?
You can check the key alone within warm restart check that you have introduced. You can use m_coppTable for this purpose.
@dgsudharsan warm-reboot will not show the error message, the error log only shows in syslog when service swss restart
With your changes warmboot and warmupgrade will be broken. A suggestion would be to check if app_db is populated already and ignore setting app db. Another suggestion is to remove app_db entries as in warm reboot @prsunny Can you please suggest your opinion?
Looks like the change is impacting regular warmboot. As such, we cannot have this change merged. Can you please check on the options provided by @dgsudharsan.
The modification is just for the warm-restart case which will set a "swss warm restart" flag in STATUS_DB. I want to avoid the case in which the coppmgr will try to set the same configuration from config_DB into APPL_DB which is already existed during warm restart. When user enable "warm_restart" on swss and APPL_DB does not be cleared, this code will skipped the CoPP configuration which from config_DB to APPL_DB
In the db_migratror case mentioned by @dgsudharsan, the enhancement will still apply the configuration from copp_cfg.json into APPL_DB because there is no entry in APPL_DB which is cleared in db_migrator.
The modification is just for the warm-restart case which will set a "swss warm restart" flag in STATUS_DB. I want to avoid the case in which the coppmgr will try to set the same configuration from config_DB into APPL_DB which is already existed during warm restart. When user enable "warm_restart" on swss and APPL_DB does not be cleared, this code will skipped the CoPP configuration which from config_DB to APPL_DB
In the db_migratror case mentioned by @dgsudharsan, the enhancement will still apply the configuration from copp_cfg.json into APPL_DB because there is no entry in APPL_DB which is cleared in db_migrator.
Have you tested this scenario? Can you please perform a warm boot with your fix and check if entries are populated after regular warm boot?
I try the warm start scenario and the test steps as following: step 1. sudo config warm_restart enable swss step 2. db_migrator.py -o migrate step 3. sudo service swss restart
on step 2, I check APPL_DB has been cleared, and after the step3 the entries of APPL_DB has been populated again
Another warm reboot scenario has same result, the test steps as following:
step 1. sudo config warm_restart enable swss step 2. db_migrator.py -o migrate step 3. sudo warm-reboot
I try the warm start scenario and the test steps as following: step 1. sudo config warm_restart enable swss step 2. db_migrator.py -o migrate step 3. sudo service swss restart
on step 2, I check APPL_DB has been cleared, and after the step3 the entries of APPL_DB has been populated again
Another warm reboot scenario has same result, the test steps as following:
step 1. sudo config warm_restart enable swss step 2. db_migrator.py -o migrate step 3. sudo warm-reboot
If using db_migrator helps should we change sonic-mgmt test script rather than changing code here?
Hi, @dgsudharsan
Sorry, I don't know what are you mean, I just simulate the scenario that you say the entries will not be populated due to this code change, so I use db_migrator.py to clear CoPP setting from Appl_DB to check does this issue exist or not.
Hi, @dgsudharsan Sorry, I don't know what are you mean, I just simulate the scenario that you say the entries will not be populated due to this code change, so I use db_migrator.py to clear CoPP setting from Appl_DB to check does this issue exist or not.
I believe after using DB migrator you don't see the issue correct? If that's the case we need to use DB migrator logic in the test scripts to avoid the CoPP error logs. @prsunny what is your opinion on this?