rocm-pytorch-gfx803-docker
Permission problems?
Hi,
After trying to follow the steps to use your Docker image, I have so far not found a way to use rocminfo (or anything else that accesses ROCm) inside the container.
Currently I am trying the following:
podman run -it --device=/dev/kfd --device=/dev/dri --net=host --group-add=video --ipc=host --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v $HOME/sddocker:/sddocker localhost/rocm-pytorch-gfx803
And then I try this:
(environ) sduser@HAL:~$ rocminfo
ROCk module is loaded
Unable to open /dev/kfd read-write: Permission denied
root is not member of "nogroup" group, the default DRM access group. Users must be a member of the "nogroup" group or another DRM access group in order for ROCm applications to run successfully.
It kind of surprises me that it reports "root" rather than sduser.
Any ideas how to solve this?
A Podman container will never have more permissions than the user running it, so my guess is that your user outside the container isn't in the video or render group, as the ROCm docs require. If it is, then I really have no idea. Also, send your system info so I know what we are working with.
Hi, thanks for your answer. First of all, my user outside of the container:
martin@HAL:~$ groups
martin adm cdrom sudo dip video plugdev render lpadmin lxd sambashare docker
My system is Ubuntu 22.04:
martin@HAL:~$ uname -a
Linux HAL 5.19.0-40-generic #41~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Mar 31 16:00:14 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
rocminfo on the host, outside the container:
` martin@HAL:~$ rocminfo
ROCk module is loaded
HSA System Attributes
Runtime Version: 1.1
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
==========
HSA Agents
Agent 1
Name: AMD Ryzen 5 2600 Six-Core Processor
Uuid: CPU-XX
Marketing Name: AMD Ryzen 5 2600 Six-Core Processor
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 32768(0x8000) KB
Chip ID: 0(0x0)
ASIC Revision: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 3400
BDFID: 0
Internal Node ID: 0
Compute Unit: 12
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
WatchPts on Addr. Ranges:1
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: FINE GRAINED
Size: 32792656(0x1f46050) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 32792656(0x1f46050) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 3
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 32792656(0x1f46050) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:
Agent 2
Name: gfx803
Uuid: GPU-XX
Marketing Name: Radeon RX 580 Series
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 64(0x40)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 16(0x10) KB
Chip ID: 26591(0x67df)
ASIC Revision: 1(0x1)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 1366
BDFID: 1792
Internal Node ID: 1
Compute Unit: 36
SIMDs per CU: 4
Shader Engines: 4
Shader Arrs. per Eng.: 1
WatchPts on Addr. Ranges:4
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 64(0x40)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 40(0x28)
Max Work-item Per CU: 2560(0xa00)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 8388608(0x800000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx803
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*** Done ***
`
And now for the devices:
martin@HAL:~$ ls /dev/ -l | grep kfd
crw-rw---- 1 root render 511, 0 Apr 24 22:18 kfd
martin@HAL:~$ ls /dev/dri/ -l
drwxr-xr-x 2 root root 80 Apr 25 15:07 by-path
crw-rw----+ 1 root video 226, 0 Apr 25 15:07 card0
crw-rw----+ 1 root render 226, 128 Apr 24 22:18 renderD128
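For anyone who wants to compare the host and the container side by side, the listing above can be reproduced programmatically. The following is only a minimal sketch: `describe_device` is a hypothetical helper name, and the device paths are the ones from this thread. It reports the owning group as the current namespace sees it and whether the current process can open the node read-write.

```python
import grp
import os
import stat


def describe_device(path):
    """Summarise ownership and read-write access for one device node."""
    st = os.stat(path)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        # The gid has no name in this namespace's group database.
        group = str(st.st_gid)
    return {
        "path": path,
        "group": group,
        "mode": stat.filemode(st.st_mode),
        # True if the owning gid is among this process's groups.
        "in_group": st.st_gid in os.getgroups() or st.st_gid == os.getegid(),
        # True if the effective permissions allow opening read-write.
        "rw": os.access(path, os.R_OK | os.W_OK),
    }


if __name__ == "__main__":
    # Paths taken from the listing in this thread.
    for dev in ("/dev/kfd", "/dev/dri/renderD128"):
        if os.path.exists(dev):
            print(describe_device(dev))
        else:
            print(dev, "not present")
```

Running it once on the host and once inside the container should make any difference in the reported group name or in `rw` visible directly.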
Maybe this helps a bit?
Try adding --group-add nogroup to the run parameters; maybe it will help.
` martin@HAL:~/Projekte/rocm-pytorch-gfx803-docker$ podman run -it --device=/dev/kfd --device=/dev/dri --net=host --group-add=video --group-add=nogroup --ipc=host --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v $HOME/sddocker:/sddocker localhost/rocm-pytorch-gfx803
(environ) sduser@HAL:~$ rocminfo
ROCk module is loaded
Unable to open /dev/kfd read-write: Permission denied
root is not member of "nogroup" group, the default DRM access group. Users must be a member of the "nogroup" group or another DRM access group in order for ROCm applications to run successfully. `
That didn't help ;)
:skull:
So to solve this issue, I have made an updated rootless version. Still bloated, but working. And as a demo it runs the Stable Diffusion web UI.
#5
PS: The issue above is caused by Podman's relative UID/GID mapping: the IDs inside the container differ from those on the host, so they need to be mapped first. It's a pain in the bottom for sure, hence the rootless idea to get around this issue.
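To illustrate the mapping problem, here is a small sketch of how a user namespace translates host IDs. It assumes the standard three-column format of `/proc/<pid>/gid_map` (inside ID, outside ID, count) and the kernel's default overflow ID of 65534, which Ubuntu names `nobody`/`nogroup`. The map values and the host gid 109 are hypothetical, and `parse_id_map` and `host_to_container` are illustrative names, not part of Podman.

```python
OVERFLOW_ID = 65534  # kernel default overflow uid/gid; nobody/nogroup on Ubuntu


def parse_id_map(text):
    """Parse uid_map/gid_map text into (inside, outside, count) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            entries.append(tuple(int(f) for f in fields))
    return entries


def host_to_container(host_id, id_map):
    """Translate a host ID to the ID seen inside the namespace."""
    for inside, outside, count in id_map:
        if outside <= host_id < outside + count:
            return inside + (host_id - outside)
    # IDs with no mapping appear as the overflow ID inside the namespace.
    return OVERFLOW_ID


if __name__ == "__main__":
    # Hypothetical rootless map: host gid 1000 becomes gid 0 inside, and a
    # subordinate range starting at 100000 covers container gids 1..65536.
    gid_map = parse_id_map("0 1000 1\n1 100000 65536\n")
    print(host_to_container(1000, gid_map))  # 0
    print(host_to_container(109, gid_map))   # 65534 (hypothetical host gid, unmapped)
```

Under these assumptions, a device owned by a host group that is not in the map shows up inside the container as owned by gid 65534, which would be consistent with rocminfo naming "nogroup" as the DRM access group.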
Oh, I always used this container in the rootless mode, as my podman is installed that way. Making it explicitly rootless is a good idea.
Indeed, I was thinking about creating two sub-versions, docker and rootless, inside the Dockerfile.
PS: While I was testing this I was able to create 6 images successfully. Since then, however, I have been struggling with a hardware failure that causes OpenCL/ROCm kernel errors even when just running clinfo/rocminfo. It seems that part of my RX 580 died, so I had to order a new GPU.
That's sad to hear; the warranty has probably expired too. Well, I will probably merge your PR within a few hours, a few days at most. By the way, will you order a new RX 580 or more recent hardware?
My new Sapphire RX 6600 8GB just got delivered (sadly I am still at work). $220 on stock clearance, before the new 7600 comes into stock.
The RX 6600's codename is gfx1032, and using it with ROCm is only possible if I force it to be recognised as gfx1030/1031 for PyTorch.
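The post does not say how the override is applied. One commonly used mechanism is the ROCm runtime's `HSA_OVERRIDE_GFX_VERSION` environment variable, where `10.3.0` corresponds to gfx1030. The sketch below assumes that mechanism; `force_gfx_version` is a hypothetical helper name.

```python
import os


def force_gfx_version(version="10.3.0", env=None):
    """Set HSA_OVERRIDE_GFX_VERSION unless a value is already present."""
    env = os.environ if env is None else env
    # setdefault keeps any value the user exported before starting Python.
    env.setdefault("HSA_OVERRIDE_GFX_VERSION", version)
    return env["HSA_OVERRIDE_GFX_VERSION"]


if __name__ == "__main__":
    # Assumption: this must run before the ROCm runtime is loaded,
    # i.e. before `import torch`.
    print(force_gfx_version())
```

Exporting the variable in the shell before launching Python should have the same effect; the helper only keeps the setting next to the code that depends on it.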
Nice