ucx icon indicating copy to clipboard operation
ucx copied to clipboard

UCP/UCS/MEMORY/GTEST: Assign address and length if the memory was considered as HOST

Open dmitrygx opened this issue 4 years ago • 4 comments

What

  1. Assign address and length if the memory was considered as HOST.
  2. Don't use the result of memory type detection if it is not the same as memory_type got from a user.
  3. Write the test which reproduces the issue.

Why ?

This is to allow users to use UCX_TLS=ib and GPUDirectRDMA capability without specifying any MDs (e.g. CUDA/ROCm) which are able to detect memory type.

How ?

  1. Update ucs_memory_info_set_host/ucs_memory_info_set_unknown to set address/length which is passed by user.
  2. Introduce ucp_context_get_reg_memory function which check the result ucp_memory_detect_internal and if the memory types are not the same don't use its result.
  3. Run test_ucp_mmap test without GPU awareness to catch possible errors when CUDA/ROCm memory buffers are registered. It reproduces the original issue that is going to be fixed by "1" and "2".

dmitrygx avatar Dec 02 '21 09:12 dmitrygx

@bureddy @Akshay-Venkatesh could you review pls?

dmitrygx avatar Dec 03 '21 13:12 dmitrygx

We should disallow passing GPU memory when UCX_TLS explicitly disable Cuda. What if users disable memtype cache, or pass smaller buffer that would use eager/bcopy?

does it mean that we should always do memtype cache lookup (or detection) to check whether the memory type passed by the user is supported or not?

dmitrygx avatar Dec 06 '21 16:12 dmitrygx

does it mean that we should always do memtype cache lookup (or detection) to check whether the memory type passed by the user is supported or not?

I think we should do it when ENABLE_PARAMS_CHECKS is turned on (debug build) We can document "ucp_request_param_t::memory_type" that it must be one of the types returned by @ref ucp_context_attr_t.memory_types

https://github.com/openucx/ucx/blob/3e26ae3a765a15b917bc2300816e82e725eb21a8/src/ucp/api/ucp.h#L1139

yosefe avatar Dec 06 '21 20:12 yosefe

I think we should do it when ENABLE_PARAMS_CHECKS is turned on (debug build)

👍

We can document "ucp_request_param_t::memory_type" that it must be one of the types returned by @ref ucp_context_attr_t.memory_types

nice idea, a user should pass only memory types recognized by ucp_context_attr_t.memory_types thanks!

dmitrygx avatar Dec 06 '21 20:12 dmitrygx