Akshay-Venkatesh

25 issues by Akshay-Venkatesh

## What In preparation for https://github.com/openucx/ucx/pull/7847 being broken into separate PRs, introduce md_query_v2 in this PR.

## What Add dmabuf fd field in md_mem attributes ## Why ? Needed by UCT/IB to register device memory exposed as a dmabuf

API

## What Use entry position of given device in `/sys/bus/pci/devices` instead of device iteration count as seen by topo sys on the given process ## Why ? Hopefully this ensures...
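The idea of keying a device by its entry position under `/sys/bus/pci/devices` can be sketched roughly as below. This is only an illustration, not the UCX implementation: the function name `pci_entry_position`, the `sysfs_dir` parameter, and the choice to sort the entries lexicographically are all assumptions made here so that the result does not depend on the order in which a given process enumerates devices.

```python
import os


def pci_entry_position(bdf, sysfs_dir="/sys/bus/pci/devices"):
    """Return the index of PCI address `bdf` among the entries of `sysfs_dir`.

    Hypothetical helper: entries are sorted lexicographically (an assumption)
    so that every process sees the same position for the same device.
    Returns -1 if the device is not listed.
    """
    entries = sorted(os.listdir(sysfs_dir))
    try:
        return entries.index(bdf)
    except ValueError:
        return -1
```

Because the position is derived from the directory contents rather than from a per-process iteration counter, two processes that see different subsets of devices through their runtime would still compute the same index for the same PCI address.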

## What Detect whether the remote/local memory types for the perf estimate are cuda/cuda-managed. If so, report peak device memory bandwidth. ## Why ? Preparation for device staging pipeline protocols....

## What Removes uct/cuda dependency on cuda runtime ## Why ? - generally a minimum cuda driver version covers all functionality that cuda_runtime provides so additional dependency not needed ###...

Seeing the following compiler warnings with nvidia hpc sdk 22.2 available here https://developer.nvidia.com/nvidia-hpc-sdk-releases ## Background information ### What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch...

Target: v4.1.x

### Describe the bug The full error message is: ``` $ # UCX at master $ mpirun -np 2 --npernode 2 --mca btl ^openib,smcuda --mca pml ucx --mca pml_ucx_devices any...

Bug

## What Set default ratio to 1.0 which means that cuda pinned allocations of any size will be registered fully by IB. ## Why ? Pinned device memory is not...

## Why ? Use native capabilities in rcache to limit the size of mappings that can be cached by cuda_ipc transport. For example, `UCX_CUDA_IPC_RCACHE_MAX_REGIONS=10 UCX_CUDA_IPC_RCACHE_MAX_SIZE=1mb` limits the maximum number of...

## What/Why? Allow a single UCP context to handle multiple CUDA devices for cuda_copy transport. This enables use cases under Legion/Realm, OpenACC, and MPI workloads that prefer 1:N process-to-GPU mapping...