gdrcopy
The ioctl call in gdr_pin_buffer fails with return code -1. Perhaps the GDRDRV_IOC_PIN_BUFFER flags are incorrect.
-bash-4.2$ ./validate
buffer size: 327680
device ptr: 7fffa0600000
gdr open: 0xc9abf0
before ioctl GDRDRV IOC PIN BUFFER c020da01
After ioctl retcode -1
-bash-4.2$
-bash-4.2$ ./copybw
GPU id:0 name:Tesla V100-SXM2-32GB PCI domain: 0 bus: 26 device: 0
GPU id:1 name:Tesla V100-SXM2-32GB PCI domain: 0 bus: 28 device: 0
GPU id:2 name:Tesla V100-SXM2-32GB PCI domain: 0 bus: 136 device: 0
GPU id:3 name:Tesla V100-SXM2-32GB PCI domain: 0 bus: 138 device: 0
selecting device 0
testing size: 131072
rounded size: 131072
device ptr: 7fffa0600000
before ioctl GDRDRV IOC PIN BUFFER c020da01
After ioctl size -1
closing gdrdrv
-bash-4.2$
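Since the report wonders whether the request flags are wrong, the value `c020da01` printed before the failing ioctl can be split into its fields. The sketch below assumes the generic Linux ioctl encoding used on x86_64 (8-bit command number, 8-bit type, 14-bit size, 2-bit direction); the function name `decode_ioctl` is made up for illustration. It only shows how the number is composed. It does not explain the -1 result, for which the `errno` value (not shown in the output above) would be needed.

```python
# Field widths of the generic Linux ioctl encoding (asm-generic/ioctl.h),
# assumed here because the report comes from an x86_64 machine.
NR_BITS, TYPE_BITS, SIZE_BITS, DIR_BITS = 8, 8, 14, 2
NR_SHIFT = 0
TYPE_SHIFT = NR_SHIFT + NR_BITS
SIZE_SHIFT = TYPE_SHIFT + TYPE_BITS
DIR_SHIFT = SIZE_SHIFT + SIZE_BITS

# Direction is seen from user space: "write" means user space writes to the driver.
DIRECTIONS = {0: "none", 1: "write", 2: "read", 3: "read/write"}


def decode_ioctl(request):
    """Split a 32-bit ioctl request number into its four fields."""
    return {
        "direction": DIRECTIONS[(request >> DIR_SHIFT) & ((1 << DIR_BITS) - 1)],
        "size": (request >> SIZE_SHIFT) & ((1 << SIZE_BITS) - 1),
        "type": (request >> TYPE_SHIFT) & ((1 << TYPE_BITS) - 1),
        "nr": (request >> NR_SHIFT) & ((1 << NR_BITS) - 1),
    }


# The request value printed by ./validate and ./copybw in the report.
fields = decode_ioctl(0xC020DA01)
print("direction={direction} size={size} type=0x{type:02x} nr={nr}".format(**fields))
# prints: direction=read/write size=32 type=0xda nr=1
```

The value decodes to a read/write request with a 32-byte argument, type 0xda, command number 1, which is what an `_IOWR`-style definition would produce. That suggests the request number itself is well formed and the failure is more likely reported by the driver, so the `dmesg` output is the natural next thing to look at.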
@immaraj could you please attach the output of the commands below?
uname -r
cat /etc/os-release
dmesg (after installation of gdrcopy and execution of validate)
nvidia-smi
dmesg
cat /proc/driver/nvidia/version
cat /proc/driver/nvidia/params
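The commands requested above can also be gathered in one pass so that nothing is skipped or pasted with its annotation. This is a minimal sketch, not part of gdrcopy: the helper names `run_command` and `collect` are hypothetical, and the 30-second timeout is an arbitrary choice so that a hanging tool does not block the rest. `dmesg` is placed last on purpose, so it is read after gdrcopy has been installed and `validate` has been run.

```python
import shlex
import subprocess

# Commands requested in the thread. dmesg goes last so that it captures
# kernel messages logged after installing gdrcopy and running validate.
COMMANDS = [
    "uname -r",
    "cat /etc/os-release",
    "nvidia-smi",
    "cat /proc/driver/nvidia/version",
    "cat /proc/driver/nvidia/params",
    "dmesg",
]


def run_command(command, timeout=30):
    """Run one command and return its combined stdout/stderr, or a note on failure."""
    try:
        result = subprocess.run(
            shlex.split(command),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout,
        )
        return result.stdout
    except OSError as exc:
        # Raised when the program is missing or cannot be executed.
        return "[could not run: {}]\n".format(exc)
    except subprocess.TimeoutExpired:
        return "[timed out after {} s]\n".format(timeout)


def collect(commands=COMMANDS):
    """Return one report string with each command followed by its output."""
    sections = []
    for command in commands:
        sections.append("$ {}\n{}".format(command, run_command(command)))
    return "\n".join(sections)
```

Calling `print(collect())` after running `./validate` produces a single report that can be attached to the issue. Reading `dmesg` may require root on some systems; in that case the error text from `dmesg` itself appears in the report.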
@immaraj could you please close this bug if it does not reproduce anymore? thanks
-bash-4.2$ uname -r
3.10.0-514.44.1.el7.x86_64
-bash-4.2$ cat /etc/os-release
NAME="Red Hat Enterprise Linux Server"
VERSION="7.3 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="7.3"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.3 (Maipo)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.3:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.3
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.3"
-bash-4.2$ dmesg (after installation of gdrcopy and execution of validate)
-bash: syntax error near unexpected token `after'
-bash-4.2$ nvidia-smi
^C^C
-bash-4.2$ uname -r
3.10.0-514.44.1.el7.x86_64
-bash-4.2$ nvidia-smi
Sat Jun 22 18:17:57 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.79       Driver Version: 410.79       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  Off  | 00000000:1A:00.0 Off |                    0 |
| N/A   34C    P0    55W / 300W |      0MiB / 32480MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-SXM2...  Off  | 00000000:1C:00.0 Off |                    0 |
| N/A   34C    P0    55W / 300W |      0MiB / 32480MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla V100-SXM2...  Off  | 00000000:88:00.0 Off |                    0 |
| N/A   35C    P0    56W / 300W |      0MiB / 32480MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla V100-SXM2...  Off  | 00000000:8A:00.0 Off |                    0 |
| N/A   34C    P0    55W / 300W |      0MiB / 32480MiB |      6%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
-bash-4.2$ cat /etc/os-release
NAME="Red Hat Enterprise Linux Server"
VERSION="7.3 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="7.3"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.3 (Maipo)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.3:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.3
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.3"
-bash-4.2$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 410.79 Thu Nov 15 10:41:04 CST 2018
GCC version: gcc version 4.8.5 20150623 (Red Hat 4.8.5-28) (GCC)
-bash-4.2$ cat /proc/driver/nvidia/params
Mobile: 4294967295
ResmanDebugLevel: 4294967295
RmLogonRC: 1
ModifyDeviceFiles: 1
DeviceFileUID: 0
DeviceFileGID: 0
DeviceFileMode: 438
UpdateMemoryTypes: 4294967295
InitializeSystemMemoryAllocations: 1
UsePageAttributeTable: 4294967295
EnableMSI: 1
MapRegistersEarly: 0
RegisterForACPIEvents: 1
CheckPCIConfigSpace: 1
EnablePCIeGen3: 0
MemoryPoolSize: 0
KMallocHeapMaxSize: 0
VMallocHeapMaxSize: 0
IgnoreMMIOCheck: 0
TCEBypassMode: 0
UseThreadedInterrupts: 1
EnableStreamMemOPs: 0
EnableBacklightHandler: 0
EnableUserNUMAManagement: 1
RegistryDwords: ""
RegistryDwordsPerDevice: ""
RmMsg: ""
AssignGpus: ""
GpuBlacklist: ""
-bash-4.2$
@immaraj Which gdrcopy version are you using? Is that v1.3 or did you clone it from the master branch?
@immaraj It would be great if you could try again with v2.1 and close this issue if it works. Thanks.