Transceiver Onboarding Test Plan

Open mihirpat1 opened this issue 2 years ago • 9 comments

Description of PR

The "Transceiver Onboarding Test Plan" lays out a framework of tests so that any new transceiver being onboarded to SONiC can be validated with minimal user intervention. The test plan also helps ensure that a newly onboarded transceiver is on par with the existing feature support in SONiC.

Summary: Fixes # (issue)

Type of change

  • [ ] Bug fix
  • [x] Testbed and Framework(new/improvement)
  • [ ] Test case(new/improvement)

Back port request

  • [ ] 201911
  • [ ] 202012
  • [ ] 202205

Approach

What is the motivation for this PR?

How did you do it?

How did you verify/test it?

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

mihirpat1 avatar Sep 26 '23 02:09 mihirpat1

@mihirpat1 can you add a test case to ensure that there are no xcvrd I2C bus access errors during firmware download and activation?

prgeor avatar Dec 27 '23 17:12 prgeor

@mihirpat1 please add a test case that checks the dmesg log for optoe I2C errors during firmware download
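
A minimal sketch of how such a dmesg check could look, assuming the dmesg output is captured as text before and after the download. The pattern is an assumption (a line mentioning "optoe" together with "error" or "fail"); actual kernel messages may be worded differently, and the helper names are hypothetical.

```python
import re

# Assumption: suspect lines mention "optoe" together with a word such as
# "error" or "fail"; real kernel messages may be worded differently.
OPTOE_ERROR_RE = re.compile(r"optoe.*(error|fail)", re.IGNORECASE)


def find_optoe_errors(dmesg_text):
    """Return the dmesg lines that look like optoe I2C errors."""
    return [line for line in dmesg_text.splitlines() if OPTOE_ERROR_RE.search(line)]


def new_optoe_errors(dmesg_before, dmesg_after):
    """Return suspect lines present after the download but not before it."""
    seen = set(find_optoe_errors(dmesg_before))
    return [line for line in find_optoe_errors(dmesg_after) if line not in seen]
```

A test would capture dmesg once before starting the download and once after it completes, then assert that `new_optoe_errors` returns an empty list.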

prgeor avatar Jul 25 '24 14:07 prgeor

@mihirpat1 do we have a test covering independent datapath behavior? Shut/no-shut on one datapath should not impact the others. This is needed for breakout ports. For example, with an 8x100G breakout, we should shut the 1st logical port and ensure there is no flap on ports 2 to 8, then shut the 2nd logical port and ensure there is no flap on ports 1 and 3 to 8, and so on. Finally, repeat in the reverse order: shut 8, then 7, and so on down to 1.
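
The sequence described above could be sketched roughly as follows. This is an illustration only, not part of the test plan: `shut_no_shut` and `read_flap_counts` are hypothetical callables that the test harness would have to provide, and logical ports are numbered from 1 for simplicity.

```python
def shut_order(num_ports):
    """Ports to toggle, first in ascending and then in descending order."""
    forward = list(range(1, num_ports + 1))
    return forward + forward[::-1]


def check_independent_datapaths(num_ports, shut_no_shut, read_flap_counts):
    """Toggle each logical port in turn; return (toggled, flapped_sibling) pairs.

    shut_no_shut(port) and read_flap_counts() -> {port: count} are
    hypothetical callables supplied by the test harness.
    """
    failures = []
    for port in shut_order(num_ports):
        before = read_flap_counts()
        shut_no_shut(port)
        after = read_flap_counts()
        # Every port other than the toggled one must keep its flap count.
        for sibling in range(1, num_ports + 1):
            if sibling != port and after[sibling] != before[sibling]:
                failures.append((port, sibling))
    return failures
```

An empty result means no sibling port flapped while any other logical port was toggled.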

prgeor avatar Jul 25 '24 15:07 prgeor

@mihirpat1 Please add a separate test case for validating VDM

@prgeor I have now added a test case to validate VDM.

mihirpat1 avatar Aug 08 '24 15:08 mihirpat1

@mihirpat1 please add a test case that checks the dmesg log for optoe I2C errors during firmware download

@prgeor I have added a test case for this now.

mihirpat1 avatar Aug 08 '24 15:08 mihirpat1

@mihirpat1 do we have a test covering independent datapath behavior? Shut/no-shut on one datapath should not impact the others. This is needed for breakout ports. For example, with an 8x100G breakout, we should shut the 1st logical port and ensure there is no flap on ports 2 to 8, then shut the 2nd logical port and ensure there is no flap on ports 1 and 3 to 8, and so on. Finally, repeat in the reverse order: shut 8, then 7, and so on down to 1.

@prgeor Yes - I have added more details on this now.

mihirpat1 avatar Aug 08 '24 15:08 mihirpat1

@mihirpat1 please add a test case to ensure the module comes up in low power mode by default. One way to do that is to keep xcvrd disabled and boot up the system.

prgeor avatar Sep 03 '24 18:09 prgeor

@mihirpat1 The following test cases are planned to be added:

  1. Add testcase to abort FW download at various intervals and ensure inactive FW is 0.0.0
  2. Add testcase to execute FW download while the module is in LPM
  3. Add testcase to ensure LowPwrAllowRequestHW is set to 1 after issuing sfputil reset
  4. Add testcase to ensure transceiver is in LPM if xcvrd boot-up is disabled through pmon_daemon_control.json.
  5. Add dmesg output check for firmware upgrade. Also, modify dmesg command for platforms which do not use optoe kernel driver
  6. Make an exception for Mellanox to expect pmon to restart as part of syncd and swss restart
  7. Add a TC to modify xcvrd.py file to cause a crash and then ensure that xcvrd restarts without causing link flap
  8. Add a TC to disable Tx and ensure that the DP state is no longer DPActivated within MaxDurationDPTxTurnOff time. This can be a stress test.
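
As an illustration of item 8, the timing check could be built on a small polling helper like the one below. It is a sketch under stated assumptions: `read_dp_state` is a hypothetical callable that returns the data path state as a string, and the caller passes the module's advertised MaxDurationDPTxTurnOff (converted to seconds) as `max_duration_s`.

```python
import time


def wait_for_dp_deactivated(read_dp_state, max_duration_s, poll_interval_s=0.05,
                            clock=time.monotonic, sleep=time.sleep):
    """Poll until the DP state is no longer DPActivated or the deadline passes.

    Returns True if the state left DPActivated within max_duration_s.
    read_dp_state is a hypothetical callable returning the state as a string.
    """
    deadline = clock() + max_duration_s
    while True:
        if read_dp_state() != "DPActivated":
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)
```

The `clock` and `sleep` parameters exist only so the helper can be exercised with a fake clock. For a stress test, the Tx disable and this wait would be repeated in a loop.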

mihirpat1 avatar Sep 10 '24 17:09 mihirpat1

@mihirpat1 The following test cases are planned to be added:

  1. Add testcase to abort FW download at various intervals and ensure inactive FW is 0.0.0
  2. Add testcase to execute FW download while the module is in LPM
  3. Add testcase to ensure LowPwrAllowRequestHW is set to 1 after issuing sfputil reset
  4. Add testcase to ensure transceiver is in LPM if xcvrd boot-up is disabled through pmon_daemon_control.json.
  5. Add dmesg output check for firmware upgrade. Also, modify dmesg command for platforms which do not use optoe kernel driver
  6. Make an exception for Mellanox to expect pmon to restart as part of syncd and swss restart
  7. Add a TC to modify xcvrd.py file to cause a crash and then ensure that xcvrd restarts without causing link flap
  8. Add a TC to disable Tx and ensure that the DP state is no longer DPActivated within MaxDurationDPTxTurnOff time. This can be a stress test.

The above comments have been addressed.

mihirpat1 avatar Oct 04 '24 20:10 mihirpat1

/azp run

mssonicbld avatar Dec 31 '24 00:12 mssonicbld

Azure Pipelines could not run because the pipeline triggers exclude this branch/path.

azure-pipelines[bot] avatar Dec 31 '24 00:12 azure-pipelines[bot]

@mihirpat1 As part of the shut/no shut test, ensure that LOSFlagTx{lane} and CDRLOLFlagTx{lane} are not set after no shut is executed.
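
A small sketch of that flag check, assuming the transceiver status has already been read into a dictionary of booleans. The key layout (flag name followed by a 1-based lane number) is an assumption for this example, and `set_tx_flags` is a hypothetical helper.

```python
def set_tx_flags(status, num_lanes):
    """Return names of Tx LOS / CDR LOL flags that are set on any lane.

    status maps field names to booleans. The key layout (flag name followed
    by a 1-based lane number) is an assumption made for this sketch.
    Missing keys are treated as "not set".
    """
    names = []
    for lane in range(1, num_lanes + 1):
        for prefix in ("LOSFlagTx", "CDRLOLFlagTx"):
            key = f"{prefix}{lane}"
            if status.get(key, False):
                names.append(key)
    return names
```

After the no shut, the test would assert that the returned list is empty.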

mihirpat1 avatar Jan 31 '25 20:01 mihirpat1

@mihirpat1 Add a test case to validate auto-squelch: whenever RX LOL or RX LOS is raised on a lane, OutputStatusRx is expected to be False on the same lane.
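
The auto-squelch expectation can be expressed as a per-lane predicate. The sketch below assumes the per-lane status has been collected into dictionaries; the key names `rx_los`, `rx_lol` and `output_status_rx` are hypothetical stand-ins for the actual fields.

```python
def squelch_violations(lanes):
    """Return lanes where RX LOS or RX LOL is raised but output is still on.

    lanes maps a lane number to a dict with boolean entries "rx_los",
    "rx_lol" and "output_status_rx" (hypothetical key names).
    """
    return [
        lane
        for lane, s in sorted(lanes.items())
        if (s["rx_los"] or s["rx_lol"]) and s["output_status_rx"]
    ]
```

A passing module yields an empty list, meaning every lane with RX LOS or RX LOL raised has its output squelched.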

mihirpat1 avatar Feb 06 '25 05:02 mihirpat1

/azp run

mssonicbld avatar Feb 11 '25 06:02 mssonicbld

Azure Pipelines could not run because the pipeline triggers exclude this branch/path.

azure-pipelines[bot] avatar Feb 11 '25 06:02 azure-pipelines[bot]

/azp run

mssonicbld avatar Feb 11 '25 18:02 mssonicbld

Azure Pipelines could not run because the pipeline triggers exclude this branch/path.

azure-pipelines[bot] avatar Feb 11 '25 18:02 azure-pipelines[bot]

@mihirpat1 also capture that static fields such as PN, SN, and date code are NOT expected to change across a firmware upgrade
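
One possible way to express this check is to snapshot the static fields before and after the upgrade and compare them. The field names below are assumed for illustration and may not match the actual EEPROM keys; `changed_static_fields` is a hypothetical helper.

```python
# Assumed key names for illustration; the actual EEPROM field names may differ.
STATIC_FIELDS = ("part_number", "serial_number", "date_code")


def changed_static_fields(before, after, fields=STATIC_FIELDS):
    """Return {field: (old, new)} for static fields that differ after upgrade."""
    return {
        f: (before.get(f), after.get(f))
        for f in fields
        if before.get(f) != after.get(f)
    }
```

The test would assert that the result is empty, while fields expected to change, such as the firmware version, are left out of the comparison.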

prgeor avatar Feb 11 '25 20:02 prgeor

/azp run

mssonicbld avatar Feb 11 '25 21:02 mssonicbld

Azure Pipelines could not run because the pipeline triggers exclude this branch/path.

azure-pipelines[bot] avatar Feb 11 '25 21:02 azure-pipelines[bot]

@mihirpat1 also capture that static fields such as PN, SN, and date code are NOT expected to change across a firmware upgrade

@prgeor I have addressed this now.

mihirpat1 avatar Feb 11 '25 21:02 mihirpat1

/azp run

mssonicbld avatar Feb 12 '25 19:02 mssonicbld

Azure Pipelines could not run because the pipeline triggers exclude this branch/path.

azure-pipelines[bot] avatar Feb 12 '25 19:02 azure-pipelines[bot]

/azp run

mssonicbld avatar Feb 12 '25 19:02 mssonicbld

Azure Pipelines could not run because the pipeline triggers exclude this branch/path.

azure-pipelines[bot] avatar Feb 12 '25 19:02 azure-pipelines[bot]

/azp run

mssonicbld avatar Feb 12 '25 23:02 mssonicbld

Azure Pipelines could not run because the pipeline triggers exclude this branch/path.

azure-pipelines[bot] avatar Feb 12 '25 23:02 azure-pipelines[bot]

@mihirpat1 Add a test case to check page select byte functionality on transceivers that support it.

mihirpat1 avatar Feb 24 '25 19:02 mihirpat1

/azp run

mssonicbld avatar Feb 27 '25 07:02 mssonicbld

Azure Pipelines could not run because the pipeline triggers exclude this branch/path.

azure-pipelines[bot] avatar Feb 27 '25 07:02 azure-pipelines[bot]