Nilesh M Negi

Results 11 issues of Nilesh M Negi

Unable to download data for minigo (https://github.com/mlperf/training/blob/master/reinforcement/tensorflow/minigo/ml_perf/get_data.py). Are there any alternate links? ``` Running: gsutil -m cp -r gs://minigo-pub/ml_perf/checkpoint/9 ml_perf/checkpoint Traceback (most recent call last): File "ml_perf/get_data.py", line 73, in...

https://github.com/mlperf/training_results_v0.6/tree/master/NVIDIA/benchmarks/resnet/implementations/mxnet/README.md: The requirements list the MXNet 18.11-py3 NGC container, whereas the Dockerfile uses the MXNet 19.05-py3 NGC container.

Trying to use the end-of-file *RESULT* statements in logs on [training_results_v0.7/NVIDIA/results/dgxa100_ngc20.06_pytorch/gnmt/](https://github.com/mlcommons/training_results_v0.7/tree/master/NVIDIA/results/dgxa100_ngc20.06_pytorch/gnmt) and [training_results_v0.7/NVIDIA/results/dgxa100_ngc20.06_pytorch/transformer/](https://github.com/mlcommons/training_results_v0.7/tree/master/NVIDIA/results/dgxa100_ngc20.06_pytorch/transformer). For gnmt: ``` $ for i in `ls NVIDIA/results/dgxa100_ngc20.06_pytorch/gnmt/result_*` ; do grep -m1 "^RESULT" $i ; done...
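The shell loop above prints the first line beginning with `RESULT` from each log file. A minimal Python sketch of the same extraction follows, assuming the logs are plain text; the helper names `first_result_line` and `collect_results` are hypothetical and not part of any MLPerf tooling, and the glob pattern is the one quoted in the issue.

```python
import glob
from typing import Dict, Optional


def first_result_line(path: str) -> Optional[str]:
    """Return the first line starting with 'RESULT' (like grep -m1 '^RESULT'), or None."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        for line in fh:
            if line.startswith("RESULT"):
                return line.rstrip("\n")
    return None


def collect_results(pattern: str) -> Dict[str, Optional[str]]:
    """Map each file matching the glob pattern to its first RESULT line."""
    return {path: first_result_line(path) for path in sorted(glob.glob(pattern))}


if __name__ == "__main__":
    # Path pattern taken from the issue text; adjust to the local checkout.
    pattern = "NVIDIA/results/dgxa100_ngc20.06_pytorch/gnmt/result_*"
    for path, line in collect_results(pattern).items():
        print(path, line if line is not None else "<no RESULT line>")
```

Unlike the shell loop, this version also reports files that contain no `RESULT` line, which makes incomplete logs easier to spot.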

The wikidumps download link referenced in BERT's [README.md](https://github.com/mlperf/training_results_v0.7/tree/master/NVIDIA/benchmarks/bert/implementations/pytorch) does not exist anymore. Is there an alternative one?

## Details
**Work item:** Internal
**What were the changes?** Enable the use of `amdclang++` instead of `hipcc` for building RCCL.
**Why were the changes made?**
- Update `CXX` and `C`...

ci:extended
gfx942
gfx942-multinode

- Added Dockerfile
- Updated README.md with instructions for using Dockerfile

noCI

## Details
**Work item:** Internal
**What were the changes?**
- Modify RCCL build for Address Sanitizer (ASAN)-enabled builds to only target GPU architectures with `xnack+`.
- Remove older GPU architectures...

ci:extended
gfx942

## Details
**Work item:** Internal
**What were the changes?** Support custom `CMAKE_PREFIX_PATH` when building MSCCLPP
**Why were the changes made?** `CMAKE_PREFIX_PATH` specified for RCCL build was not being passed to...

gfx942

## Details
**Work item:** Internal
**What were the changes?** Update README on using RCCL with fewer than 8 MI300 GPUs and how to improve performance
## Approval Checklist
___Do not...

noCI

## Details
**Work item:** Internal
**What were the changes?** Update RCCL CHANGELOG for ROCm 6.2.x

noCI
ci:docs-only