rccl icon indicating copy to clipboard operation
rccl copied to clipboard

Added new unit test for register.cc

Open deeksha-amd opened this issue 7 months ago • 6 comments

Details

Work item: Internal

What were the changes?
This PR introduces one register test: GraphRegisterDeregister: To verify buffer registration and deregistration functions (ncclCommGraphRegister and ncclCommGraphDeregister) work correctly across all available GPUs

Why were the changes made? The new test improves the code coverage for register.cc file.

Approval Checklist

Do not approve until these items are satisfied.

  • [ ] Verify the CHANGELOG has been updated, if
    • there are any NCCL API version changes,
    • any changes impact library users, and/or
    • any changes impact any other ROCm library.

deeksha-amd avatar May 27 '25 12:05 deeksha-amd

CI failure at linking?

thananon avatar May 28 '25 20:05 thananon

Hi @thananon This pull request depends on the following PR: https://github.com/ROCm/rccl/pull/1664

Once the above PR is merged, this one should pass CI and no longer fail. Let me know if you have any further questions.

deeksha-amd avatar Jun 04 '25 16:06 deeksha-amd

/azp run

nileshnegi avatar Jun 27 '25 19:06 nileshnegi

Azure Pipelines successfully started running 2 pipeline(s).

azure-pipelines[bot] avatar Jun 27 '25 19:06 azure-pipelines[bot]

We have one test GraphRegisterDeregister. It invokes ncclCommGraphRegister and ncclCommGraphDeregister and checks the return value. Do we need to check the actual effect of these function calls, besides checking the return value?

Do we need tests for other functions in the register.cc file? ncclRegFind ncclRegLocalIsValid ncclRegister regCleanup ncclRegCleanup ncclCommRegister_impl commDeregister ncclCommDeregister_impl

venksubramd avatar Jul 11 '25 12:07 venksubramd

@venksubramd

  1. Partially yes, it's more of an API sanity check and tests the intended lifecycle: allocate -> register -> deregister ->free
  2. The new test added improves the code coverage from 90%(existing) to 100% for register.cc file. So, all the other functions are already covered.

deeksha-amd avatar Aug 12 '25 12:08 deeksha-amd