Yanfei Guo

Results: 21 issues by Yanfei Guo

## Pull Request Description 1. Add an interface for querying the GPU device list and subdevice list in MPL. MPL returns an array of integers, each representing an individual GPU device or subdevice....

## Pull Request Description The script will run recursively if no command line option is provided. Delaying the check removes a confusing warning message. ## Author Checklist * [ ]...

We could do something similar to what SLURM did with CUDA: https://slurm.schedmd.com/gres.html#GPU_Management. We also need to investigate the assignment approach for AMD and Intel GPUs.
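One simple assignment policy is to bind each node-local rank to a device round-robin. The sketch below illustrates that idea only; it is not what SLURM or any MPI runtime actually does. The helper names `assign_gpu` and `format_visible_devices` are hypothetical, and exporting the result per process through a visibility variable such as `CUDA_VISIBLE_DEVICES` is an assumption of this sketch.

```c
#include <stdio.h>

/* Hypothetical helper: pick a device index for a node-local rank,
 * cycling through the devices when ranks outnumber GPUs.
 * Returns -1 if the inputs are invalid. */
int assign_gpu(int local_rank, int num_gpus)
{
    if (local_rank < 0 || num_gpus <= 0)
        return -1;
    return local_rank % num_gpus;
}

/* Hypothetical helper: write the device index as a string suitable for a
 * visibility variable such as CUDA_VISIBLE_DEVICES.
 * Returns 0 on success, -1 on invalid input or truncation. */
int format_visible_devices(int device, char *buf, size_t len)
{
    if (device < 0 || buf == NULL || len == 0)
        return -1;
    int n = snprintf(buf, len, "%d", device);
    return (n > 0 && (size_t) n < len) ? 0 : -1;
}
```

With 4 GPUs, local rank 6 maps to device 2 under this policy. AMD and Intel runtimes have their own visibility variables (for example `HIP_VISIBLE_DEVICES` and `ZE_AFFINITY_MASK`), so the formatting step would likely differ per vendor.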

## Pull Request Description The goal of this PR is to use GenQ for pack buffer allocation. This avoids the costly allocation of GPU-registered host buffers on the fly...

The following failure occurs consistently for the CH4-UCX build. The test is now marked as xfail.

```
not ok 558 - ./datatype/darray_pack 72
---
Directory: ./datatype
File: darray_pack
Num-procs: 72
Timeout:...
```

## Pull Request Description The pop on CPPFLAGS will clean up the HIP-related flags. This PR depends on the fix at https://github.com/pmodels/yaksa/pull/231. ## Author Checklist * [ ] **Provide Description** Particularly...

## Pull Request Description Receiver-side free cell allocation does not work due to: 1. MPMC dequeue does not exit on success 2. trying to use the global rank of the receiver at...

## Pull Request Description ## Checklist * [ ] Reference appropriate issues (with "Fixes" or "See" as appropriate) * [ ] Commits are self-contained and do not do two things...

We need to investigate the best strategy for performance tuning in the CUDA backend. One knob is the thread block size versus the number of blocks.
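The two quantities are coupled: for a fixed number of elements, choosing a thread block size determines how many blocks are needed to cover them. Below is a minimal sketch in plain C of the ceiling-division arithmetic commonly used to size a CUDA launch. The helper name `num_blocks_for` and the sample sizes are illustrative and not taken from the backend.

```c
#include <stddef.h>

/* Hypothetical helper: number of blocks needed so that
 * num_blocks * block_size >= num_elems (ceiling division).
 * Returns 0 when block_size is 0 to avoid dividing by zero. */
size_t num_blocks_for(size_t num_elems, size_t block_size)
{
    if (block_size == 0)
        return 0;
    return num_elems / block_size + (num_elems % block_size != 0);
}
```

For 1000 elements, a block size of 256 needs 4 blocks while a block size of 128 needs 8. Which point on that curve performs best depends on the kernel and the device, which is what the tuning study would have to measure.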

## Pull Request Description This PR adds support for detecting node topology and for runtime selection of regular/stream memcpy. The PR has four parts: 1. Fixing the existing MPMC queue and...