sonic-buildimage

[voq][chassis][dhcp_relay] swss.sh tries to start the dhcp_relay service although it is masked

Open mlok-nokia opened this issue 1 year ago • 1 comments

Why I did it

On the master branch, dhcp_relay is not supported on VOQ chassis, so it is disabled in the FEATURE table. However, because of the dependency, swss.sh always calls "systemctl start" on it even though its service file has been masked/disabled. The resulting error is logged in syslog, which causes the log analyzer to fail on some of the OC tests. Fixes issue https://github.com/sonic-net/sonic-buildimage/issues/18822

Work item tracking
  • Microsoft ADO (number only):

How I did it

Added code to swss.sh to check whether a service is disabled. If it is disabled, swss.sh does not start the service even though it is in the DEPENDENT list. This keeps the ERROR log from showing up in the syslog file.
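The check described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function names `unit_state`, `should_start`, `start_unit`, and `start_dependents` are hypothetical, and using `systemctl is-enabled` to read the unit state is an assumption (the actual change may consult the FEATURE table or use a different check).

```shell
#!/bin/bash
# Hypothetical sketch: skip dependent services whose unit is masked or disabled.

# Print the enablement state of a unit (e.g. enabled, disabled, masked).
# Assumption: `systemctl is-enabled` is used as the source of truth.
unit_state() {
    systemctl is-enabled "$1.service" 2>/dev/null
}

# Return 0 if the service may be started, 1 if it is masked or disabled.
should_start() {
    case "$(unit_state "$1")" in
        masked|masked-runtime|disabled) return 1 ;;
        *) return 0 ;;
    esac
}

# Thin wrapper so the start action can be replaced when experimenting.
start_unit() {
    systemctl start "$1"
}

# Walk a list of dependent services and start only the eligible ones.
start_dependents() {
    local dep
    for dep in "$@"; do
        if should_start "$dep"; then
            start_unit "$dep"
        else
            echo "skipping $dep (masked or disabled)"
        fi
    done
}
```

In this sketch a call such as `start_dependents dhcp_relay` would print a skip message instead of invoking `systemctl start` on a masked unit, so no "Unit dhcp_relay.service is masked" error would be logged.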

How to verify it

  1. Reboot the system or execute a config reload. The following error messages should not be seen in the syslog file:
Apr 27 23:15:14.472425 ixre-egl-board7 ERR systemctl[7299]: Failed to start dhcp_relay.service: Unit dhcp_relay.service is masked.
Apr 27 23:15:14.476897 ixre-egl-board7 ERR systemctl[7298]: Failed to start dhcp_relay.service: Unit dhcp_relay.service is masked.
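One way to automate the check above is to count occurrences of the error message in the log. The helper name `count_masked_start_errors` is hypothetical, and the log path in the usage note is an assumption about where syslog is written on the device.

```shell
#!/bin/bash
# Hypothetical helper: count "unit is masked" start failures for dhcp_relay
# in the given log file.
count_masked_start_errors() {
    # -F matches the message as a fixed string, -c prints the match count.
    # grep exits non-zero when the count is 0, so "|| true" keeps that case
    # from being treated as a failure.
    grep -c -F "Failed to start dhcp_relay.service: Unit dhcp_relay.service is masked." "$1" || true
}
```

For example, `count_masked_start_errors /var/log/syslog` run after a reboot or config reload should print 0 when the fix is in place.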

Which release branch to backport (provide reason below if selected)

This issue only exists on the master branch, which uses the newer kernel version.

  • [ ] 201811
  • [ ] 201911
  • [ ] 202006
  • [ ] 202012
  • [ ] 202106
  • [ ] 202111
  • [ ] 202205
  • [ ] 202211
  • [ ] 202305

Tested branch (Please provide the tested image version)

Tested on the master branch.

  • [ ]
  • [ ]

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

mlok-nokia avatar Apr 30 '24 03:04 mlok-nokia

@kellyyeh please review this at your earliest convenience, thanks.

rlhui avatar May 01 '24 17:05 rlhui

@lguohan @kellyyeh Please help to review and merge this PR. Thanks

mlok-nokia avatar May 08 '24 15:05 mlok-nokia

hi @lguohan / @yxieca , could you help merge?

wenyiz2021 avatar May 15 '24 22:05 wenyiz2021

> hi @lguohan / @yxieca , could you help merge?

This change is generally in the right direction. I am surprised that it only affected dhcp relay. @qiluo-msft , @saiarcot895 do you have other concerns for the change?

yxieca avatar May 15 '24 22:05 yxieca

> > hi @lguohan / @yxieca , could you help merge?
>
> This change is generally in the right direction. I am surprised that it only affected dhcp relay. @qiluo-msft , @saiarcot895 do you have other concerns for the change?

We have seen a similar issue where the sup has bgp disabled but not masked, and swss startup also starts bgp, which caused an issue. https://github.com/sonic-net/sonic-buildimage/pull/15734 was raised to fix that. The fix in the current PR looks more general.

wenyiz2021 avatar May 15 '24 22:05 wenyiz2021

For swss, there's also the docker-wait-any script that gets started as part of the wait command (this basically checks whether one of the dependent containers has exited; if so, it brings down this container). That might need to be modified as well here?

saiarcot895 avatar May 15 '24 23:05 saiarcot895

> For swss, there's also the docker-wait-any script that gets started as part of the wait command (this basically checks whether one of the dependent containers has exited; if so, it brings down this container). That might need to be modified as well here?

This particular issue only involves the "DEPENDENT" variable. docker-wait-any uses the "MULTI_INST_DEPENDENT" variable, so there is no need to modify docker-wait-any.

mlok-nokia avatar May 22 '24 16:05 mlok-nokia

@prabhataravind would you be able to review/approve, so that this PR could be merged? thanks.

rlhui avatar Jun 19 '24 17:06 rlhui

> @prabhataravind would you be able to review/approve, so that this PR could be merged? thanks.

@rlhui Reviewed and approved the latest diff.

prabhataravind avatar Jun 26 '24 18:06 prabhataravind

Hi @yxieca are we good to merge this PR?

yaqiangz avatar Aug 21 '24 06:08 yaqiangz

Cherry-pick PR to 202405: https://github.com/sonic-net/sonic-buildimage/pull/20182

mssonicbld avatar Sep 06 '24 21:09 mssonicbld