[voq][chassis][dhcp_relay] swss.sh tries to start the dhcp_relay service although it is masked
Why I did it
On the master branch, dhcp_relay is not supported on VOQ chassis and is disabled in the FEATURE table. However, because dhcp_relay is one of its dependent services, swss.sh always calls "systemctl start" on it, even though its service file has been masked/disabled. The following error is then logged in syslog, which causes logAnalyze to fail on some of the OC tests. Fixes issue https://github.com/sonic-net/sonic-buildimage/issues/18822
Work item tracking
- Microsoft ADO (number only):
How I did it
Added code to swss.sh to check whether a service is disabled. If it is disabled, swss.sh does not start the service even though it is in the DEPENDENT list. This keeps the ERROR log from showing up in the syslog file.
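As a rough illustration of the approach described above (not the actual diff in this PR), the check could look like the sketch below. The function names `should_start_service` and `start_dependent_services` are hypothetical, and the sketch assumes the unit state is read with `systemctl is-enabled`, which prints states such as `masked` or `disabled`.

```bash
# Hypothetical helper: decide from a `systemctl is-enabled` state string
# whether a dependent service should be started.
should_start_service() {
    local state="$1"
    case "$state" in
        masked|masked-runtime|disabled) return 1 ;;  # skip masked/disabled units
        *) return 0 ;;                               # start everything else, as before
    esac
}

# Hypothetical wrapper: start each dependent service passed as an argument,
# skipping the ones that are masked or disabled.
start_dependent_services() {
    local dep state
    for dep in "$@"; do
        # `systemctl is-enabled` prints the unit state; it exits non-zero for
        # masked/disabled units, so only its output is used here.
        state="$(systemctl is-enabled "${dep}.service" 2>/dev/null)"
        if should_start_service "$state"; then
            systemctl start "${dep}.service"
        else
            echo "Skipping ${dep}: unit is ${state}"
        fi
    done
}
```

With this sketch, calling `start_dependent_services dhcp_relay` on a system where dhcp_relay is masked would print a skip message instead of invoking `systemctl start`.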
How to verify it
- Reboot the system or execute a config reload. The following error messages should no longer be seen in the syslog file:
Apr 27 23:15:14.472425 ixre-egl-board7 ERR systemctl[7299]: Failed to start dhcp_relay.service: Unit dhcp_relay.service is masked.
Apr 27 23:15:14.476897 ixre-egl-board7 ERR systemctl[7298]: Failed to start dhcp_relay.service: Unit dhcp_relay.service is masked.
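One possible way to check for these messages after the reboot or config reload is to count matching lines in syslog. This is a minimal sketch; the helper name `count_masked_start_errors` is hypothetical, and `/var/log/syslog` is assumed as the default log location.

```bash
# Hypothetical helper: count "masked" start failures for dhcp_relay in a syslog file.
# Prints the number of matching lines (0 when the fix is effective).
count_masked_start_errors() {
    local syslog_file="${1:-/var/log/syslog}"
    # -F treats the pattern as a fixed string; -c prints the match count.
    grep -cF 'Failed to start dhcp_relay.service: Unit dhcp_relay.service is masked' "$syslog_file"
}
```

Running `count_masked_start_errors` on a fixed image is expected to print `0`. Note that `grep -c` exits with a non-zero status when there are no matches, which matters if the helper is used in a script running under `set -e`.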
Which release branch to backport (provide reason below if selected)
This issue only exists on the master branch, which uses the newer kernel version.
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205
- [ ] 202211
- [ ] 202305
Tested branch (Please provide the tested image version)
Tested on the master branch.
- [ ]
- [ ]
Description for the changelog
Link to config_db schema for YANG module changes
A picture of a cute animal (not mandatory but encouraged)
@kellyyeh please review this at your earliest convenience, thanks.
@lguohan @kellyyeh Please help to review and merge this PR. Thanks
hi @lguohan / @yxieca , could you help merge?
This change is generally in the right direction. I am surprised that it only affected dhcp relay. @qiluo-msft , @saiarcot895 do you have other concerns for the change?
We have seen a similar issue where the sup has bgp disabled but not masked, and swss startup also starts bgp, which caused a problem. https://github.com/sonic-net/sonic-buildimage/pull/15734 was raised to fix that. The fix in the current PR looks more general.
For swss, there's also the `docker-wait-any` script that gets started as part of the `wait` command (this basically checks whether one of the dependent containers has exited; if so, it brings down this container). That might need to be modified as well here?
This particular issue only affects the services listed in the "DEPENDENT" variable. docker-wait-any uses a different variable, "MULTI_INST_DEPENDENT", so there is no need to modify docker-wait-any.
@prabhataravind would you be able to review/approve, so that this PR could be merged? thanks.
@rlhui Reviewed and approved the latest diff.
Hi @yxieca are we good to merge this PR?
Cherry-pick PR to 202405: https://github.com/sonic-net/sonic-buildimage/pull/20182