
Support GPU direct as memory endpoint

JulianKunkel opened this issue 4 years ago · 22 comments

Would be useful to support benchmarking deep learning workflows. Could use a new flag such as "--memory-buffer-gpu".

JulianKunkel · Nov 30 '20

I believe John Ravi/NC State has code to do this that he is cleaning up. @sbyna may know the status of it.

glennklockwood · Nov 30 '20

John is doing some fine-tuning of the code, which allows IOR to work both with and without GDS. He has been busy with another project, but should be done with IOR shortly and open a PR.

sbyna · Dec 01 '20

I've been implementing the CUDA malloc/free part, allowing buffers to live on the GPU. Fun result on a V100:

```
$ ./src/ior -o /dev/shm/test
write 1479   read 3610

$ ./src/ior -O allocateBufferOnGPU=1 -o /dev/shm/test
write 2122   read 5236
```

I ran it several times; the results are reproducible.

JulianKunkel · Jan 21 '21

Does -O allocateBufferOnGPU=1 also cause IOR to use O_DIRECT? There is also the separate question of whether IOR is filling in the data pattern in the buffers in this case, i.e. whether the CPU is touching the pages.

What filesystem (and if Lustre, what version)? There are Lustre-specific GDS enhancements in some Lustre versions, but they have not yet landed in the master branch because of the 2.14 feature freeze.

adilger · Jan 22 '21

The option purely allocates the buffer on the GPU; IOR fills the buffer as usual, but thanks to unified memory the pages are migrated back to the GPU. My goal is to allow different combinations of options, so users can decide to additionally use O_DIRECT. I'll be adding the further feature of using GPU Direct, but I have no test system that could benefit from it.
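For illustration, a minimal sketch of that allocation pattern, assuming CUDA managed memory; this is not IOR's actual code, just the mechanism described above (one pointer, CPU fills the pattern, the driver migrates pages on demand):

```c
/* Sketch (not IOR's code): a unified-memory I/O buffer.
 * Build: nvcc um_sketch.cu -o um_sketch */
#include <cuda_runtime.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    size_t size = 1 << 20;   /* 1 MiB */
    char *buf;

    /* Managed memory: one pointer valid on both CPU and GPU;
     * pages migrate to whichever side touches them. */
    cudaMallocManaged((void **)&buf, size, cudaMemAttachGlobal);

    memset(buf, 0xAB, size); /* CPU fills the data pattern as usual */

    /* A plain POSIX write() still works: the driver migrates the
     * pages to the host for the copy, then back to the GPU when a
     * kernel next touches the buffer. */
    int fd = open("/dev/shm/test", O_CREAT | O_WRONLY, 0644);
    ssize_t n = write(fd, buf, size);
    printf("wrote %zd bytes\n", n);

    close(fd);
    cudaFree(buf);
    return 0;
}
```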

JulianKunkel · Jan 22 '21

I have added support for gpuDirect via the cuFile API:

```
$ ./src/ior --gpuDirect --posix.odirect
```

It basically stores one block of the file in the read or write buffer on the GPU.
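For anyone unfamiliar with the API, a minimal sketch of the cuFile write path; this is not IOR's code, just the general call sequence (the target path is hypothetical, and cuFile needs a GDS-capable filesystem):

```c
/* Minimal cuFile sketch (not IOR's code): write one block of a
 * file directly from GPU memory via GPUDirect Storage.
 * Build: nvcc gds_sketch.cu -lcufile -o gds_sketch */
#define _GNU_SOURCE            /* for O_DIRECT */
#include <cuda_runtime.h>
#include <cufile.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    size_t size = 1 << 20;     /* one 1 MiB block */
    void *devbuf;

    cuFileDriverOpen();                 /* initialize the GDS driver */
    cudaMalloc(&devbuf, size);          /* buffer lives on the GPU */
    cudaMemset(devbuf, 0xAB, size);     /* data pattern, set on the GPU */

    /* cuFile requires the fd to be opened with O_DIRECT */
    int fd = open("/mnt/lustre/testfile", O_CREAT | O_WRONLY | O_DIRECT, 0644);

    CUfileDescr_t descr;
    memset(&descr, 0, sizeof(descr));
    descr.handle.fd = fd;
    descr.type = CU_FILE_HANDLE_TYPE_OPAQUE_FD;

    CUfileHandle_t handle;
    cuFileHandleRegister(&handle, &descr);

    /* DMA straight from device memory to storage, no host bounce buffer */
    cuFileWrite(handle, devbuf, size, /* file_offset */ 0, /* buf_offset */ 0);

    cuFileHandleDeregister(handle);
    close(fd);
    cudaFree(devbuf);
    cuFileDriverClose();
    return 0;
}
```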

Unfortunately, I have no system where I can sensibly test it with a file system such as Lustre. Therefore, I created a library that fakes the GPUDirect stuff and tested it against NVIDIA's examples: https://github.com/VI4IO/fake-gpudirect-cufile/

That means something may not be working completely as intended, and it needs testing. I'm happy to finish development on a system that has support, or to have someone else come back with inquiries. Once that works, I will add the options to the md* benchmarks.

JulianKunkel · Jan 25 '21

I did test this patch on a DGX-A100 with GDS (GPU Direct Storage) enabled. A couple of pieces of feedback: it would be even simpler to allow "--with-cuda=PATH" instead of CPPFLAGS="-I/usr/local/cuda-11/include" LDFLAGS="-L/usr/local/cuda-11/lib64" for a custom CUDA path. '-O allocateBufferOnGPU=1' works, but I didn't find the '--gpuDirect' option. It would also be nice to have a GPU affinity setting, e.g. if a node has multiple GPUs, IOR would select which GPU id to use.
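Regarding the affinity point, a minimal sketch of what per-rank GPU selection could look like under MPI; the rank-modulo-device policy here is purely illustrative, not something the patch implements:

```c
/* Hypothetical sketch: bind each MPI rank to a GPU round-robin. */
#include <cuda_runtime.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, ndev;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    cudaGetDeviceCount(&ndev);
    cudaSetDevice(rank % ndev);  /* e.g. 8 ranks, 4 GPUs -> 2 ranks per GPU */

    /* ... allocate buffers and run the benchmark on the selected GPU ... */

    MPI_Finalize();
    return 0;
}
```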

sihara · Feb 01 '21

Thanks for testing.

It seems it may not have detected cufile.h. When you run configure, it needs to output:

```
checking for cufile.h... yes
checking for library containing cuFileDriverOpen... -lcufile
```

Once that is there, it should support gpuDirect:

```
$ ./src/ior --help
Synopsis ./src/ior

Flags
  -c, --collective      Use collective I/O
  -C                    reorderTasks -- changes task ordering for readback (useful to avoid client cache)
  --gpuDirect           allocate I/O buffers on the GPU and use gpuDirect to store data; this option is incompatible with any option requiring CPU access to data.
...

Module POSIX

Flags
  --posix.odirect       Direct I/O Mode
  --gpuDirect           allocate I/O buffers on the GPU
```

JulianKunkel · Feb 01 '21

I've added support for paths, i.e., --with-cuda=<PATH> and --with-gpuDirect=<PATH> should work.

JulianKunkel · Feb 01 '21

@JulianKunkel where did you push the code so I can test again?

sihara · Feb 10 '21

Hi, it is in the same PR #323. You can chat with me on VI4IO if there is any issue.

JulianKunkel · Feb 10 '21

Since the basic version has landed (though it couldn't be tested very thoroughly), I'll close the issue for now.

JulianKunkel · Feb 25 '21

Hi, I am testing I/O systems with support for GPUDirect in a heterogeneous cluster.

The cluster has 7 nodes with Tesla T4 GPUs with GPUDirect support and NVMe-oF. I would like to have several nodes access the same NVMe disk concurrently in order to measure the bandwidth.

I have tried to use the flags --gpuDirect, --with-cuda and --with-gpuDirect, but I get the following error while executing the command:

```
Error invalid argument: --gpuDirect
Error invalid argument: --with-cuda
Error invalid argument: --with-gpuDirect
Invalid options
```

The command was:

```
mpirun -n 2 -H nodo51:1,nodo52:1 ./src/ior -t 1m -b 16m -s 16 -o /nvme/testFile -O allocateBufferOnGPU=1 --gpuDirect --with-cuda=/usr/local/cuda-11.8/bin --with-gpuDirect=/usr/local/cuda-11.8/bin
```

If anyone can help me, I would appreciate it.

ajtarraga · Mar 08 '23

In order to use these flags, configure must first find CUDA (./configure options: --with-cuda --with-gpuDirect); note that these are configure options, not ior options. Once compiled, you should see the options listed when running ./ior --help.

JulianKunkel · Mar 08 '23

I have configured with these flags; however, I still don't see the --gpuDirect option when executing the command. What command should I use to test with GPU and GPUDirect? @JulianKunkel

ajtarraga · Mar 08 '23

Could it be that nvcc is not in the path? My configure looks like this:

```
./configure --with-gpuDirect=/usr/local/cuda/targets/x86_64-linux --with-nvcc --with-cuda=/usr/local/cuda/targets/x86_64-linux
```

You need to check whether the configure output has:

```
checking for cufile.h
checking for cuda_runtime.h
```

JulianKunkel · Mar 09 '23

I have configured like you said:

```
./configure --with-gpuDirect=/usr/local/cuda-11.8/targets/x86_64-linux --with-nvcc --with-cuda=/usr/local/cuda-11.8/targets/x86_64-linux
```

And I get the output you indicated:

```
checking for cufile.h... yes
checking for cuda_runtime.h... yes
```

After running make clean and make, I searched for the gpuDirect option and can't see it; I can only see:

```
-O allocateBufferOnGPU=X -- allocate I/O buffers on the GPU: X=1 uses managed memory - verifications are run on CPU; X=2 managed memory - verifications on GPU; X=3 device memory with verifications on GPU
```
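For context, those modes boil down to which CUDA allocator is used; a rough sketch of the distinction, not IOR's actual code:

```c
/* Rough sketch (not IOR's code) of the allocation flavours behind
 * -O allocateBufferOnGPU=X. */
#include <cuda_runtime.h>
#include <stddef.h>

void *alloc_io_buffer(int mode, size_t size) {
    void *buf = NULL;
    if (mode == 1 || mode == 2) {
        /* X=1 / X=2: managed memory -- the CPU can still touch the
         * buffer, so verification can run on either side. */
        cudaMallocManaged(&buf, size, cudaMemAttachGlobal);
    } else if (mode == 3) {
        /* X=3: plain device memory -- not CPU-accessible, so
         * verification has to run on the GPU. */
        cudaMalloc(&buf, size);
    }
    return buf;
}
```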

ajtarraga · Mar 09 '23

Okay, so I did a git pull to the latest version dbb1f7d00023ca00. It is challenging to find the issue on your end. Can you share your config.h? I've added the output of `grep -v "/" src/config.h | grep -v "^$"`:

```

#define HAVE_CUDA_RUNTIME_H 1
#define HAVE_CUFILE_H 1
#define HAVE_FCNTL_H 1
#define HAVE_GETTIMEOFDAY 1
#define HAVE_INTTYPES_H 1
#define HAVE_LIBINTL_H 1
#define HAVE_MEMORY_H 1
#define HAVE_MEMSET 1
#define HAVE_MKDIR 1
#define HAVE_MPI 1
#define HAVE_PUTENV 1
#define HAVE_REALPATH 1
#define HAVE_REGCOMP 1
#define HAVE_STATFS 1
#define HAVE_STATVFS 1
#define HAVE_STDINT_H 1
#define HAVE_STDLIB_H 1
#define HAVE_STRCASECMP 1
#define HAVE_STRCHR 1
#define HAVE_STRERROR 1
#define HAVE_STRINGS_H 1
#define HAVE_STRING_H 1
#define HAVE_STRNCASECMP 1
#define HAVE_STRSTR 1
#define HAVE_SYSCONF 1
#define HAVE_SYS_IOCTL_H 1
#define HAVE_SYS_MOUNT_H 1
#define HAVE_SYS_PARAM_H 1
#define HAVE_SYS_STATFS_H 1
#define HAVE_SYS_STATVFS_H 1
#define HAVE_SYS_STAT_H 1
#define HAVE_SYS_TIME_H 1
#define HAVE_SYS_TYPES_H 1
#define HAVE_UNAME 1
#define HAVE_UNISTD_H 1
#define HAVE_WCHAR_H 1
#define META_ALIAS "ior-4.1.0+dev-0"
#define META_NAME "ior"
#define META_RELEASE "0"
#define META_VERSION "4.1.0+dev"
#define PACKAGE_BUGREPORT ""
#define PACKAGE_NAME "ior"
#define PACKAGE_STRING "ior 4.1.0+dev"
#define PACKAGE_TARNAME "ior"
#define PACKAGE_URL ""
#define PACKAGE_VERSION "4.1.0+dev"
#define STDC_HEADERS 1
#ifndef _DARWIN_USE_64_BIT_INODE
# define _DARWIN_USE_64_BIT_INODE 1
#endif
#define _XOPEN_SOURCE 700
```

JulianKunkel · Mar 09 '23

I have the same output as you; I checked with the diff command and there is no difference between your config and mine.

ajtarraga · Mar 14 '23

I can now reproduce the issue.

JulianKunkel · Mar 15 '23

The problem appears to be that nvcc cannot be found:

```
checking for nvcc... no
```

If it works, you get sth. like this:

```
checking for nvcc... /sw/tools/cuda/11.2/bin/nvcc
```

Then during compilation, it will output sth. like:

```
nvcc -g -O2 -c -o utilities-gpu.o utilities-gpu.cu
```

Only then does GPU Direct work. I'll have to see why the configure.ac macro doesn't work as intended in this case.

Give it a try pls.

JulianKunkel · Mar 15 '23

I have tried it with:

```
sudo ./configure --with-gpuDirect=/usr/local/cuda-11.8/targets/x86_64-linux --with-nvcc=/usr/local/cuda-11.8/bin/nvcc --with-cuda=/usr/local/cuda-11.8/targets/x86_64-linux
```

And I got the output you described:

```
checking for nvcc... /usr/local/cuda-11.8/bin/nvcc
```

However, when I run make, I don't see that output. I searched the build log for the keyword nvcc and there were no matches.

I can't figure out what the problem is.

ajtarraga · Mar 15 '23