sonic-buildimage

T2-VOQ-Chassis: VS support

Open deepak-singhal0408 opened this issue 10 months ago • 14 comments

Why I did it

These changes ensure that sonic_vs.img has everything required for it to be emulated as a T2-VOQ-Chassis Supervisor or Linecard. With this change:

  1. The image can be used to emulate a Supervisor or a Linecard.
  2. The image can be used to emulate different Linecard HWSKUs.

Work item tracking
  • Microsoft ADO (number only): 27402561

How I did it

The following changes are made as part of this PR:

  1. On the VS image, database containers are recreated upon reboot. This makes it possible to change a single-asic VS image to multi-asic (docker_image_ctl.j2).
  2. Copy all Sup and Linecard HWSKU directories under the kvm_platform directory. Create lanemap.ini, coreportindexmap.ini and fabriclanemap.ini under each hwsku/Asic directory. Copy the asic.conf file from each platform directory to its child HWSKU directories under the kvm platform directory. This helps emulate different HWSKUs with their respective num_asic asics (sonic-device-data/Makefile).
  3. New sonic-platform package for VS platforms. This is needed to ensure that the database containers' bring-up goes through: the database containers on linecards fetch the supervisor slot_num, the current slot num, etc. PS: This package will be available on pizza box VS platforms as well. However, it will be a no-op there, as the package expects a metadata file, which will only be available on chassis VS platforms.
  4. Changes to generate a unique MAC address on VS platforms. This is done by using the device_hostname string (which will always be unique). Also a change to provide a unique MAC address per asic on multi-asic VOQ VS platforms (sonic-cfggen, device_info.py, minigraph.py).
  5. Make the topology.service file dependent on sonic.target so that this service gets invoked as part of config load_minigraph/config reload.
  6. topology.sh changes to move ports to their respective namespaces (only applicable on multi-asic platforms).
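
As an illustration of item 4, here is a minimal sketch of deriving a stable MAC address from the hostname, with a per-asic offset. It is not the code in device_info.py: the function name mac_from_hostname, the use of SHA-256, and the way the asic index is folded into the last octet are all assumptions made for the example.

```python
import hashlib


def mac_from_hostname(hostname: str, asic_id: int = 0) -> str:
    """Derive a deterministic MAC from a hostname (hypothetical scheme)."""
    # Hash the hostname and keep the first six bytes as the candidate address.
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    octets = bytearray(digest[:6])
    # Set the locally-administered bit and clear the multicast bit.
    octets[0] = (octets[0] | 0x02) & 0xFE
    # Assumption: offset the last octet by the asic index so each asic differs.
    octets[5] = (octets[5] + asic_id) & 0xFF
    return ":".join(f"{b:02x}" for b in octets)


if __name__ == "__main__":
    for asic in range(2):
        print(asic, mac_from_hostname("vlab-01", asic))
```

Because the address depends only on the hostname and the asic index, the same device gets the same MACs across reboots, while two devices with different hostnames get different ones (barring hash collisions).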

How to verify it

  1. Verified that sonic-vs.img.gz builds successfully and that a single-asic VS DUT comes up fine (deployed the vms-kvm-t0 topology): all containers come up and the show commands work as expected.
  2. Verified bring-up on a VOQ chassis with different flavors of linecard emulation.

Which release branch to backport (provide reason below if selected)

  • [ ] 201811
  • [ ] 201911
  • [ ] 202006
  • [ ] 202012
  • [ ] 202106
  • [ ] 202111
  • [ ] 202205
  • [ ] 202211
  • [ ] 202305

Tested branch (Please provide the tested image version)

  • [ ]
  • [ ]

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

deepak-singhal0408 avatar Mar 29 '24 23:03 deepak-singhal0408

Looks like there's a new loganalyzer error message on t0:

Apr  8 07:23:49.040082 vlab-01 ERR sfputil: Failed to instantiate Chassis due to FileNotFoundError('Metadata file /etc/sonic/vs_chassis_metadata.json not found')

Is this expected? Does it need to be added to the ignore list?

saiarcot895 avatar Apr 08 '24 17:04 saiarcot895

Thanks @saiarcot895. Added the above message to the loganalyzer_ignore file as discussed. PR: https://github.com/sonic-net/sonic-mgmt/pull/12345

deepak-singhal0408 avatar Apr 09 '24 18:04 deepak-singhal0408
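
For readers unfamiliar with ignore rules, the following sketch shows the kind of regular expression that would match the message quoted in this thread. The exact syntax of the loganalyzer ignore file is not shown in this discussion, so the pattern and the helper name is_ignored are assumptions for illustration only.

```python
import re

# Hypothetical pattern; matches the sfputil error regardless of the directory
# that holds vs_chassis_metadata.json.
IGNORE_PATTERN = re.compile(
    r"ERR sfputil: Failed to instantiate Chassis due to "
    r"FileNotFoundError\('Metadata file \S+vs_chassis_metadata\.json not found'\)"
)


def is_ignored(line: str) -> bool:
    """Return True when a syslog line matches the ignore pattern."""
    return IGNORE_PATTERN.search(line) is not None


if __name__ == "__main__":
    sample = (
        "Apr  8 07:23:49.040082 vlab-01 ERR sfputil: Failed to instantiate "
        "Chassis due to FileNotFoundError('Metadata file "
        "/etc/sonic/vs_chassis_metadata.json not found')"
    )
    print(is_ignored(sample))
```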

/azp run

deepak-singhal0408 avatar Apr 09 '24 19:04 deepak-singhal0408

Commenter does not have sufficient privileges for PR 18512 in repo sonic-net/sonic-buildimage

azure-pipelines[bot] avatar Apr 09 '24 19:04 azure-pipelines[bot]

/azp run

judyjoseph avatar Apr 09 '24 20:04 judyjoseph

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Apr 09 '24 20:04 azure-pipelines[bot]

Hi @rlhui @yxieca, it seems I don't have permission to re-run the pipeline. Could you please help check and let me know whether this is expected, or whether I am missing anything here?

deepak-singhal0408 avatar Apr 09 '24 21:04 deepak-singhal0408

@deepak-singhal0408 use /azpw instead of /azp.

saiarcot895 avatar Apr 12 '24 21:04 saiarcot895

/azp run

judyjoseph avatar Apr 12 '24 22:04 judyjoseph

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Apr 12 '24 22:04 azure-pipelines[bot]

/azpw run

deepak-singhal0408 avatar Apr 13 '24 03:04 deepak-singhal0408

/AzurePipelines run

mssonicbld avatar Apr 13 '24 03:04 mssonicbld

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Apr 13 '24 03:04 azure-pipelines[bot]

@arlakshm @judyjoseph @abdosi could you please help review when you get a chance? Thanks!

deepak-singhal0408 avatar Apr 16 '24 16:04 deepak-singhal0408

Thanks @arlakshm. @rlhui, could you please help merge this PR? Thanks!

deepak-singhal0408 avatar Apr 19 '24 16:04 deepak-singhal0408

@deepak-singhal0408 Cherry-pick conflict for MSFT repo 202205 branch. Please raise a PR directly to the MSFT repo and mention this PR in the new PR.

gechiang avatar Apr 20 '24 01:04 gechiang

@deepak-singhal0408 , seems like swss PR checks are failing with this docker change. Could you confirm all swss tests were tested with this docker vs build?

prsunny avatar Apr 24 '24 23:04 prsunny

@prsunny, this PR was merged in sonic-buildimage on April 19th, and there was a PR merge in sonic-swss on April 22nd (https://github.com/sonic-net/sonic-swss/pull/3118). The PR checker for the April 22nd change would have already picked up my changes, right? May I know why you think this PR would have caused the issue?

deepak-singhal0408 avatar Apr 25 '24 06:04 deepak-singhal0408

@deepak-singhal0408 Where can I find vs_chassis_metadata.json? I built the multi-asic vs image from this repo but there is no vs_chassis_metadata.json under /etc/sonic.

ishidawataru avatar Jun 07 '24 06:06 ishidawataru

Hi, @deepak-singhal0408 , I got the same error

Jun 21 03:01:42 vlab-01 determine-reboot-cause[15110]:   File "/usr/lib/python3/dist-packages/sonic_platform/chassis.py", line 26, in __init__
Jun 21 03:01:42 vlab-01 determine-reboot-cause[15110]:     self.metadata = self._read_metadata()
Jun 21 03:01:42 vlab-01 determine-reboot-cause[15110]:                     ^^^^^^^^^^^^^^^^^^^^^
Jun 21 03:01:42 vlab-01 determine-reboot-cause[15110]:   File "/usr/lib/python3/dist-packages/sonic_platform/chassis.py", line 34, in _read_metadata
Jun 21 03:01:42 vlab-01 determine-reboot-cause[15110]:     raise FileNotFoundError("Metadata file {} not found".format(self.metadata_file))
Jun 21 03:01:42 vlab-01 determine-reboot-cause[15110]: FileNotFoundError: Metadata file /etc/sonic/vs_chassis_metadata.json not found

yutongzhang-microsoft avatar Jun 21 '24 03:06 yutongzhang-microsoft

Hi @yutongzhang-microsoft, please give this fix a try: https://github.com/sonic-net/sonic-host-services/pull/133. JFYI, this file is not expected to be present on pizza box VS platforms. Even on a chassis-based VS image, it is not mandatory for it to be present.

deepak-singhal0408 avatar Jun 21 '24 20:06 deepak-singhal0408

Even on chassis based vs image, this is not mandatory to be present..

If the metadata file is not mandatory, how about not raising an exception at all, instead of catching FileNotFoundError everywhere the exception can be ignored?

https://github.com/sonic-net/sonic-buildimage/pull/18512/files#diff-1837abe4216c07096db3b47b74a8ce47b0097622773caf92e9dc9f3c42636d06R34

ishidawataru avatar Jun 23 '24 01:06 ishidawataru

I agree with you; some other commands also failed because of this error. @deepak-singhal0408, can you fix it as suggested?

yutongzhang-microsoft avatar Jun 24 '24 01:06 yutongzhang-microsoft
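
The suggestion discussed above, treating a missing metadata file as "no chassis metadata" rather than as an error, could be sketched roughly as follows. This is not the actual sonic_platform/chassis.py code: the standalone function read_metadata and the empty-dict fallback are assumptions; only the file path is taken from the traceback in this thread.

```python
import json

# Path taken from the error message quoted in this thread.
DEFAULT_METADATA_FILE = "/etc/sonic/vs_chassis_metadata.json"


def read_metadata(metadata_file: str = DEFAULT_METADATA_FILE) -> dict:
    """Return the parsed metadata, or an empty dict if the file is absent."""
    try:
        with open(metadata_file) as f:
            return json.load(f)
    except FileNotFoundError:
        # Assumption: on pizza box VS platforms the file is simply not there,
        # so callers get empty metadata instead of an exception.
        return {}


if __name__ == "__main__":
    print(read_metadata("/nonexistent/vs_chassis_metadata.json"))
```

With this shape, callers such as sfputil or determine-reboot-cause would not need their own FileNotFoundError handling; they would only need to cope with keys being absent from the returned dict.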