
ROC_SHMEM: intra-kernel networking runtime for AMD dGPUs on the ROCm platform.

Results 33 ROC_SHMEM issues

* imports dlmalloc (latest version MIT licensed)
* create an encapsulation class DLMalloc that exposes only relevant functionalities, to prevent using non-static/templated members of parent class we use the mspace...

- Added `--map-by numa` flag
- Added `--timeout` flag
- Added environment variable to enable/disable get tests

⚠️ WIP-DNM!

Typical output looks like:
```
salloc -whostname -N1 -n2 --gpus-per-task=1 -c6 ../rocSHMEM/scripts/functional_tests/driver.sh tests/functional_tests/rocshmem_example_driver rma logs
mpirun -n 2 -mca pml ucx -x ROCSHMEM_MAX_NUM_CONTEXTS=1 tests/functional_tests/rocshmem_example_driver -a 2 -w 1 -z 1 -s...
```

This bug fix is in develop (#31) but it has not been incorporated into ROCm 6.4.x

⚠️ WIP-DNM!

## Motivation
Add experimental support for gfx1201 architectures (Radeon RX 9070 and 9070XT, Radeon AI PRO 9700)
## Technical Details
Initial experiments with gfx12-based ISAs
## Test Result
After...

## Motivation
1. Remove unused code
2. Enable removing switch from critical path in memcpy_lane/wave/wg?
## Technical Details
Remove LOAD-STORE macros and replace usage of STORE macro with its definition...

## Motivation
Increase the maximum message rate by using all enabled threads in the wave for polling completions.
## Technical Details
Use all available threads for polling the cq to...

### Suggestion Description
## Description
Given the overall strategy of the ROCm library in the past, developers need to take a closer look at NVIDIA's NVSHMEM implementation (sharing the software development foundations). Currently...

Feature Request
status: triage

## Motivation
Remade cherry-picks after accidental force push

## Motivation
Users want a tool to compare performance between version X and version Y of our code.
## Design
A Python matplotlib script that can be used to compare the...
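The comparison step that such a script might perform before plotting can be sketched as follows. This is only an illustration under stated assumptions: the function name `compare_runs`, the dictionary layout (a metric keyed by message size), and all numbers are hypothetical and do not come from the proposed tool, whose actual design is truncated above. Plotting is left out so the sketch needs only the standard library.

```python
from statistics import geometric_mean


def compare_runs(baseline, candidate):
    """Return {key: candidate / baseline} for keys present in both runs.

    Keys that appear in only one run are skipped, since no ratio can be
    formed for them.
    """
    shared = sorted(set(baseline) & set(candidate))
    return {key: candidate[key] / baseline[key] for key in shared}


# Hypothetical metric values keyed by message size in bytes
# (illustrative numbers only, not measured results).
version_x = {8: 0.5, 1024: 4.0, 65536: 20.0}
version_y = {8: 0.6, 1024: 4.0, 65536: 25.0}

ratios = compare_runs(version_x, version_y)
for size, ratio in ratios.items():
    print(f"{size:>8} B  {ratio:.2f}x")

# A geometric mean summarizes ratios without letting one size dominate.
summary = geometric_mean(list(ratios.values()))
print(f"geometric mean: {summary:.2f}x")
```

A ratio above 1 means version Y reports a larger value than version X for that key; whether larger is better depends on the metric (bandwidth versus latency, for example), which a real tool would need to track.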

noCI