sonic-platform-daemons icon indicating copy to clipboard operation
sonic-platform-daemons copied to clipboard

[xcvrd] DPB support on platforms with CmisManagerTask enabled

Open ishidawataru opened this issue 1 year ago • 7 comments

Description

This commit fixes DPB support with CMIS transceivers.

CmisManagerTask's port_dict must be updated according to the port add/remove events. This commit removes the port_mapping field from CmisManagerTask as port_mapping was mostly used just for storing asic_id information and that can be simply done by port_dict instead.

Added a helper method get_asic_id() method to CmisManagerTask for getting asic_id from logical_port.

Motivation and Context

fixes https://github.com/sonic-net/sonic-buildimage/issues/18893

How Has This Been Tested?

I tested on the VS environment.

Additional Information (Optional)

ishidawataru avatar Jun 14 '24 02:06 ishidawataru

CLA Signed

The committers listed above are authorized under a signed CLA.

  • :white_check_mark: login: ishidawataru / name: Wataru Ishida (7639be71212c69fb807f1f8d1434d0017d5e311a, 824d0594074b3c0a1ffd435ee4ee863d08b25e82, 33566ec63496856acb83a54bd83217748b752aea, 95ae842f65441d7c713c40d68f71a9139322160e)

@ishidawataru Can you please fix the code coverage check failure?

mihirpat1 avatar Jun 18 '24 18:06 mihirpat1

@mihirpat1 I'm still communicating with the original reporter of this issue to check the fix actually works. After that I'll fix the code coverage failure.

ishidawataru avatar Jun 19 '24 10:06 ishidawataru

@mihirpat1 Confirmed that the fix resolved the issue. Also, I fixed the coverage test failure. Please review.

ishidawataru avatar Jun 25 '24 09:06 ishidawataru

@prgeor This PR fixes https://github.com/sonic-net/sonic-buildimage/issues/18893. It is not about the application selection. The current CmisManagerTask implementation doesn't delete the port info from port_dict when ports are deleted due to DPB configuration, which causes a crash.

ishidawataru avatar Jul 06 '24 01:07 ishidawataru

@longhuan-cisco @mihirpat1 I updated the PR based on @longhuan-cisco 's suggestion. Please review.

I used force-push and removed the original commits. I backed up the original branch here.

ishidawataru avatar Sep 17 '24 02:09 ishidawataru

@prgeor Hi, just a gentle ping to check if I need to do something to get this PR merged.

ishidawataru avatar Oct 29 '24 06:10 ishidawataru

/azp run

mssonicbld avatar Mar 15 '25 01:03 mssonicbld

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Mar 15 '25 01:03 azure-pipelines[bot]

@prgeor I fixed the conflicts. Please take a look.

ishidawataru avatar Mar 15 '25 01:03 ishidawataru

@ishidawataru will you be able to test this on a real hardware with breakout enabled and do insertion and removal of the optics?

prgeor avatar Mar 18 '25 05:03 prgeor

@prgeor Sorry, I don’t have access to real hardware to test this PR. I performed insertion and removal testing using the xcvr emulator in last October. I can redo those tests and provide the logs if needed.

ishidawataru avatar Mar 20 '25 04:03 ishidawataru

@prgeor

I have done OIR tests with the latest diff.

Tried below sequence on 400G AOC:

  1. Move from non-breakout mode (1x400G) to 4x100G breakout mode via DPB CLI
  2. Perform OIR
  3. Move back to non-breakout mode (1x400G) via DPB CLI
  4. Perform OIR

No xcvrd crash. link comes up fine with module ready state. show interface CLI shows correct interface entries.

longhuan-cisco avatar Mar 26 '25 22:03 longhuan-cisco

@ishidawataru @prgeor Can you please backport this change to 202411?

Keshavg-marvell avatar May 28 '25 11:05 Keshavg-marvell