DCGM
Add support for IPv6
Fixes #150
I've added IPv6 support to the hostengine (nv-hostengine) and the dcgmi CLI without changing or breaking any existing functionality.
Hostengine now supports binding to an IPv6 address
Start hostengine:
$ nv-hostengine -b [::1] --log-level debug
Started host engine version 3.3.6 using port number: 5555
Confirm using lsof:
$ sudo lsof -i :5555
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
nv-hosten 2004760 root 47u IPv6 3165553609 0t0 TCP localhost:personal-agent (LISTEN)
Connect using dcgmi without port:
$ dcgmi discovery -l --host [::1]
8 GPUs found.
+--------+----------------------------------------------------------------------+
| GPU ID | Device Information |
+--------+----------------------------------------------------------------------+
| 0 | Name: NVIDIA PG509-210 |
| | PCI Bus ID: 00000000:04:00.0 |
| | Device UUID: GPU-de6a7a6a-776e-e6e2-bd3d-d8114ccf6db2 |
Connect using dcgmi with port:
$ dcgmi discovery -l --host [::1]:5555
8 GPUs found.
+--------+----------------------------------------------------------------------+
| GPU ID | Device Information |
+--------+----------------------------------------------------------------------+
| 0 | Name: NVIDIA PG509-210 |
| | PCI Bus ID: 00000000:04:00.0 |
| | Device UUID: GPU-de6a7a6a-776e-e6e2-bd3d-d8114ccf6db2 |
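For reviewers who want to reason about the two accepted `--host` forms shown above (`[::1]` and `[::1]:5555`), here is a minimal Python sketch of how such a string can be split into an address and a port. It is not the parser used in this PR: the names `parse_host`, `_parse_port`, and `DEFAULT_PORT` are hypothetical, and the fallback to 5555 is an assumption taken from the port the hostengine reports in the output above.

```python
import ipaddress
from typing import Tuple

DEFAULT_PORT = 5555  # assumption: the port nv-hostengine reports in the transcript


def _parse_port(text: str) -> int:
    """Convert a port string to an int and check that it is in range."""
    port = int(text)  # raises ValueError for empty or non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


def parse_host(spec: str, default_port: int = DEFAULT_PORT) -> Tuple[str, int]:
    """Split a --host style string into (address, port).

    Accepts "[v6addr]", "[v6addr]:port", "host", and "host:port".
    """
    if spec.startswith("["):
        # Bracketed IPv6 literal; the brackets keep the address colons
        # from being confused with the port separator.
        end = spec.find("]")
        if end == -1:
            raise ValueError(f"missing closing bracket in {spec!r}")
        address = spec[1:end]
        ipaddress.IPv6Address(address)  # raises a ValueError subclass if invalid
        rest = spec[end + 1:]
        if not rest:
            return address, default_port
        if not rest.startswith(":"):
            raise ValueError(f"unexpected text after ']' in {spec!r}")
        return address, _parse_port(rest[1:])

    # IPv4 address or hostname, optionally followed by ":port".
    host, sep, port = spec.partition(":")
    if sep:
        return host, _parse_port(port)
    return spec, default_port


if __name__ == "__main__":
    print(parse_host("[::1]"))
    print(parse_host("[::1]:5555"))
    print(parse_host("127.0.0.1"))
```

The brackets are what make the port suffix unambiguous: without them, `::1:5555` could be read as a longer IPv6 address rather than an address plus a port.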
Hostengine now supports both IPv4 and IPv6 connections
Start hostengine:
$ nv-hostengine -b ALL --log-level debug
Started host engine version 3.3.6 using port number: 5555
Confirm using lsof:
$ sudo lsof -i :5555
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
nv-hosten 2478925 root 47u IPv6 3173352315 0t0 TCP *:personal-agent (LISTEN)
Connect using dcgmi on IPv4:
$ dcgmi discovery -l
8 GPUs found.
+--------+----------------------------------------------------------------------+
| GPU ID | Device Information |
+--------+----------------------------------------------------------------------+
| 0 | Name: NVIDIA PG509-210 |
| | PCI Bus ID: 00000000:04:00.0 |
| | Device UUID: GPU-de6a7a6a-776e-e6e2-bd3d-d8114ccf6db2 |
Connect using dcgmi on IPv6:
$ dcgmi discovery -l --host [::1]
8 GPUs found.
+--------+----------------------------------------------------------------------+
| GPU ID | Device Information |
+--------+----------------------------------------------------------------------+
| 0 | Name: NVIDIA PG509-210 |
| | PCI Bus ID: 00000000:04:00.0 |
| | Device UUID: GPU-de6a7a6a-776e-e6e2-bd3d-d8114ccf6db2 |
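The lsof output above shows a single IPv6 listener on `*` that serves both IPv4 and IPv6 clients. One common way to get that behaviour is a dual-stack socket: an `AF_INET6` socket bound to `::` with `IPV6_V6ONLY` turned off. The Python sketch below illustrates that general technique under the assumption that the hostengine does something similar; it is not taken from the PR's code, and `open_dual_stack_listener` is a hypothetical name.

```python
import socket


def open_dual_stack_listener(port: int = 0) -> socket.socket:
    """Return a listening TCP socket that accepts IPv4 and IPv6 clients.

    port=0 asks the OS to pick a free port.
    """
    sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # With IPV6_V6ONLY disabled, IPv4 clients are accepted too and show up
    # as IPv4-mapped IPv6 addresses.
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
    sock.bind(("::", port))  # "::" is the IPv6 wildcard address
    sock.listen()
    return sock


if __name__ == "__main__":
    listener = open_dual_stack_listener()
    print("listening on port", listener.getsockname()[1])
    listener.close()
```

Setting the option explicitly matters because the default value of `IPV6_V6ONLY` differs between operating systems and can be changed by system configuration.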
Hostengine default IPv4 functionality is not broken
Start hostengine:
$ nv-hostengine --log-level debug
Started host engine version 3.3.6 using port number: 5555
Confirm using lsof:
$ sudo lsof -i :5555
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
nv-hosten 2476116 root 47u IPv4 3173251779 0t0 TCP localhost:personal-agent (LISTEN)
Connect using dcgmi on IPv4:
$ dcgmi discovery -l
8 GPUs found.
+--------+----------------------------------------------------------------------+
| GPU ID | Device Information |
+--------+----------------------------------------------------------------------+
| 0 | Name: NVIDIA PG509-210 |
| | PCI Bus ID: 00000000:04:00.0 |
| | Device UUID: GPU-de6a7a6a-776e-e6e2-bd3d-d8114ccf6db2 |
Connect using dcgmi on IPv6 (expected failure):
$ dcgmi discovery -l --host [::1]
Error: unable to establish a connection to the specified host: [::1]
Error: Unable to connect to host engine. Host engine connection invalid/disconnected.
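The expected failure above follows from the default listener being bound to the IPv4 loopback only, so nothing is listening on `::1`. The Python sketch below reproduces the same check outside of DCGM: it opens an IPv4-only loopback listener on an OS-chosen port and probes it over both address families. The helper names `can_connect` and `listen_ipv4_loopback` are hypothetical and are not part of DCGM.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable addresses.
        return False


def listen_ipv4_loopback() -> socket.socket:
    """Listen on 127.0.0.1 only, on a free port chosen by the OS."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", 0))
    sock.listen()
    return sock


if __name__ == "__main__":
    listener = listen_ipv4_loopback()
    port = listener.getsockname()[1]
    print("IPv4:", can_connect("127.0.0.1", port))  # reachable
    print("IPv6:", can_connect("::1", port))        # not reachable on an IPv4-only listener
    listener.close()
```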