
AMD rocm opencl results (failing self tests)

Open kochd opened this issue 6 years ago • 31 comments

I just wanted to report the current state of the self tests with AMD's rocm (https://rocm.github.io/), as nobody has reported them so far.

System configuration

Attach details about your OS and about JtR, including:

  • $ ./john --list=build-info.
Version: 1.8.0.13-jumbo-1-bleeding-7d4dac26f0 2018-12-19 21:07:01 +0000
Build: linux-gnu 64-bit x86_64 AVX2 AC OMP
SIMD: AVX2, interleaving: MD4:3 MD5:3 SHA1:1 SHA256:1 SHA512:1
CPU tests: AVX2
$JOHN is ./
Format interface version: 14
Max. number of reported tunable costs: 4
Rec file version: REC4
Charset file version: CHR3
CHARSET_MIN: 1 (0x01)
CHARSET_MAX: 255 (0xff)
CHARSET_LENGTH: 24
SALT_HASH_SIZE: 1048576
SINGLE_IDX_MAX: 2147483648
SINGLE_BUF_MAX: 4294967295
Single words effective limit: Number of salts vs. SingleMaxBufferSize in john.conf
Max. Markov mode level: 400
Max. Markov mode password length: 30
gcc version: 8.2.0
GNU libc version: 2.27 (loaded: 2.27)
OpenCL headers version: 2.2
Crypto library: OpenSSL
OpenSSL library version: 01010101f
OpenSSL 1.1.1a  20 Nov 2018
GMP library version: 6.1.2
File locking: fcntl()
fseek(): fseek
ftell(): ftell
fopen(): fopen
memmem(): System's
  • $ ./john --list=opencl-devices (if applicable).
Platform #0 name: AMD Accelerated Parallel Processing, version: OpenCL 2.1 AMD-APP (2783.0)
    Device #0 (0) name:     gfx803
    Board name:             Ellesmere [Radeon RX 470/480]
    Device vendor:          Advanced Micro Devices, Inc.
    Device type:            GPU (LE)
    Device version:         OpenCL 1.2 
    Driver version:         2783.0 (HSA1.1,LC) - Crimson  
    Native vector widths:   char 4, short 2, int 1, long 1
    Preferred vector width: char 4, short 2, int 1, long 1
    Global Memory:          8 GB
    Global Memory Cache:    16 KB
    Local Memory:           64 KB (Local)
    Constant Buffer size:   6.8 GB
    Max memory alloc. size: 6.8 GB
    Max clock (MHz):        1580
    Profiling timer res.:   1 ns
    Max Work Group Size:    256
    Parallel compute cores: 36
    Stream processors:      2304  (36 x 64)
    Speed index:            3640320
    SIMD width:             16
    Wavefront width:        64
    PCI device topology:    0b:00.0
  • john --test=0 --format=opencl
Testing: KeePass-opencl [SHA256 AES/Twofish/ChaCha OpenCL]... FAILED (cmp_all(7))
Testing: rar-opencl, RAR3 (length 5) [SHA1 OpenCL AES]... (4xOMP) FAILED (cmp_all(1))
Testing: pgpdisk-opencl [SHA1 AES/TwoFish/CAST OpenCL]... FAILED (cmp_all(1))
Testing: 7z-opencl, 7-Zip (512K iterations) [SHA256 AES OpenCL]... (4xOMP) FAILED (cmp_all(1))
4 out of 84 tests have FAILED

There are tests that result in PASS but produce warnings:

Testing: mscash-opencl, M$ Cache Hash [MD4 OpenCL]... /tmp/AMD_12531_545/t_12531_547.cl:449:18: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
Testing: mscash2-opencl, MS Cache Hash 2 (DCC2) [PBKDF2-SHA1 OpenCL]... /tmp/AMD_12531_562/t_12531_564.cl:611:17: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
Testing: mysql-sha1-opencl, MySQL 4.1+ [SHA1 OpenCL]... /tmp/AMD_12531_579/t_12531_581.cl:139:18: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
Testing: NT-opencl [MD4 OpenCL]... /tmp/AMD_12531_613/t_12531_615.cl:332:18: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
Testing: raw-MD4-opencl [MD4 OpenCL]... /tmp/AMD_12531_868/t_12531_870.cl:247:18: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
Testing: raw-MD5-opencl [MD5 OpenCL]... /tmp/AMD_12531_885/t_12531_887.cl:271:18: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
Testing: raw-SHA1-opencl [SHA1 OpenCL]... /tmp/AMD_12531_902/t_12531_904.cl:139:18: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
Testing: salted-SHA1-opencl [SHA1 OpenCL]... /tmp/AMD_12531_970/t_12531_972.cl:151:41: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
Testing: SL3-opencl, Nokia operator unlock [SHA1 OpenCL]... /tmp/AMD_12531_1021/t_12531_1023.cl:151:41: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]

Note that NT-opencl is affected by these warnings and actually does not work in production. It fails with:

/tmp/AMD_23440_18/t_23440_20.cl:332:18: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
                __attribute__((max_constant_size (NUM_INT_KEYS * 4)))
                               ^
1 warning generated.
Self test failed (cmp_one(0))

I can't test the others, as they run fine in test mode even if test > 0.

kochd avatar Dec 30 '18 08:12 kochd

Thanks, we should try to address this before a release. Perhaps it's time I set up a Linux boot option on my Macbook again so I can try out things like this.

magnumripper avatar Dec 30 '18 09:12 magnumripper

Current rocm @ 649bb4a337e69871dd9c852c2965d3722afa34df same error

kochd avatar Mar 15 '19 15:03 kochd

How do we know it's a rocm driver? Some minimum version number?

magnumripper avatar Apr 10 '19 20:04 magnumripper

just tested with rocm-dev 2.2.31 @ jumbo 97cce637cdf3796fc2a1b28117dfad4272a9e99f

kochd avatar Apr 10 '19 20:04 kochd

clinfo:

clinfo 
Number of platforms:				 1
  Platform Profile:				 FULL_PROFILE
  Platform Version:				 OpenCL 2.1 AMD-APP (2833.0)
  Platform Name:				 AMD Accelerated Parallel Processing
  Platform Vendor:				 Advanced Micro Devices, Inc.
  Platform Extensions:				 cl_khr_icd cl_amd_event_callback cl_amd_offline_devices 


  Platform Name:				 AMD Accelerated Parallel Processing
Number of devices:				 1
  Device Type:					 CL_DEVICE_TYPE_GPU
  Vendor ID:					 1002h
  Board name:					 Ellesmere [Radeon RX 470/480]
  Device Topology:				 PCI[ B#11, D#0, F#0 ]
  Max compute units:				 36
  Max work items dimensions:			 3
    Max work items[0]:				 1024
    Max work items[1]:				 1024
    Max work items[2]:				 1024
  Max work group size:				 256
  Preferred vector width char:			 4
  Preferred vector width short:			 2
  Preferred vector width int:			 1
  Preferred vector width long:			 1
  Preferred vector width float:			 1
  Preferred vector width double:		 1
  Native vector width char:			 4
  Native vector width short:			 2
  Native vector width int:			 1
  Native vector width long:			 1
  Native vector width float:			 1
  Native vector width double:			 1
  Max clock frequency:				 1580Mhz
  Address bits:					 64
  Max memory allocation:			 7301444403
  Image support:				 Yes
  Max number of images read arguments:		 128
  Max number of images write arguments:		 8
  Max image 2D width:				 16384
  Max image 2D height:				 16384
  Max image 3D width:				 2048
  Max image 3D height:				 2048
  Max image 3D depth:				 2048
  Max samplers within kernel:			 26591
  Max size of kernel argument:			 1024
  Alignment (bits) of base address:		 1024
  Minimum alignment (bytes) for any datatype:	 128
  Single precision floating point capability
    Denorms:					 No
    Quiet NaNs:					 Yes
    Round to nearest even:			 Yes
    Round to zero:				 Yes
    Round to +ve and infinity:			 Yes
    IEEE754-2008 fused multiply-add:		 Yes
  Cache type:					 Read/Write
  Cache line size:				 64
  Cache size:					 16384
  Global memory size:				 8589934592
  Constant buffer size:				 7301444403
  Max number of constant args:			 8
  Local memory type:				 Scratchpad
  Local memory size:				 65536
  Max pipe arguments:				 16
  Max pipe active reservations:			 16
  Max pipe packet size:				 3006477107
  Max global variable size:			 7301444403
  Max global variable preferred total size:	 8589934592
  Max read/write image args:			 64
  Max on device events:				 1024
  Queue on device max size:			 8388608
  Max on device queues:				 1
  Queue on device preferred size:		 262144
  SVM capabilities:				 
    Coarse grain buffer:			 Yes
    Fine grain buffer:				 Yes
    Fine grain system:				 No
    Atomics:					 No
  Preferred platform atomic alignment:		 0
  Preferred global atomic alignment:		 0
  Preferred local atomic alignment:		 0
  Kernel Preferred work group size multiple:	 64
  Error correction support:			 0
  Unified memory for Host and Device:		 0
  Profiling timer resolution:			 1
  Device endianess:				 Little
  Available:					 Yes
  Compiler available:				 Yes
  Execution capabilities:				 
    Execute OpenCL kernels:			 Yes
    Execute native function:			 No
  Queue on Host properties:				 
    Out-of-Order:				 No
    Profiling :					 Yes
  Queue on Device properties:				 
    Out-of-Order:				 Yes
    Profiling :					 Yes
  Platform ID:					 0x7fa410a4af70
  Name:						 gfx803
  Vendor:					 Advanced Micro Devices, Inc.
  Device OpenCL C version:			 OpenCL C 2.0 
  Driver version:				 2833.0 (HSA1.1,LC)
  Profile:					 FULL_PROFILE
  Version:					 OpenCL 1.2 
  Extensions:					 cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program 

kochd avatar Apr 10 '19 20:04 kochd

Driver version: 2833.0 (HSA1.1,LC)

I suspect the HSA is a tell-tale. I'll go with that until someone says anything else. It's strange, however, that it's on device level only, it's not seen at platform level.

magnumripper avatar Apr 10 '19 21:04 magnumripper

Hmm but super has 2766.4 (PAL,HSAIL) - AMDGPU-Pro which is indeed AMD PRO and not ROCM (or can one be both? 😵 )

magnumripper avatar Apr 10 '19 21:04 magnumripper

Extensions: cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program

So maybe some hint here. We have cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_amd_copy_buffer_p2p cl_amd_assembly_program

The AMDPRO driver on super reports this: Device extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_khr_gl_depth_images cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_spir cl_khr_subgroups cl_khr_gl_event cl_khr_depth_images cl_khr_mipmap_image cl_khr_mipmap_image_writes

That includes cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt

A wild guess could be that cl_amd_assembly_program indicates rocm, but who knows.
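If an extension name is used as a hint like this, it has to be compared as a whole token, because one name can be a prefix of another (cl_amd_media_ops and cl_amd_media_ops2 both appear in the lists above). A minimal C sketch of such a check follows. has_extension is a hypothetical helper, not a function from john, and whether cl_amd_assembly_program really identifies rocm remains the guess stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not part of john): return true if `name` occurs as a
 * whole space-separated token in the extension string `exts`.
 */
bool has_extension(const char *exts, const char *name)
{
    assert(exts != NULL && name != NULL);

    size_t len = strlen(name);
    const char *p = exts;

    if (len == 0)
        return false;

    while ((p = strstr(p, name)) != NULL) {
        /* A real match starts at the beginning or right after a space... */
        bool starts_token = (p == exts) || (p[-1] == ' ');
        /* ...and ends at a space or at the end of the string. */
        bool ends_token = (p[len] == ' ') || (p[len] == '\0');

        if (starts_token && ends_token)
            return true;
        p++; /* partial match (e.g. a longer name); keep searching */
    }
    return false;
}
```

With the two extension strings quoted above, has_extension(exts, "cl_amd_assembly_program") would be true for the rocm device and false for the AMDPRO one.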

magnumripper avatar Apr 10 '19 21:04 magnumripper

Not sure I'll ever understand all AMD's acronyms. Looks like AMDPRO might include ROCr which is a ROCm runtime and maybe also a ROCK kernel driver (now with a capital K because that rocks) which may or may not be included in AMDPRO, who knows. I thought there were two different sets of AMD drivers: AMDPRO and ROCM. Apparently they are not that separate. Are they even the same thing, but one is for end users and the other for developers? I have absolutely no idea. I will leave this issue unattended - anyone who likes this mess is welcome to suggest fixes.

Oh, BTW we shouldn't need any fixes or workarounds, this is "OpenCL", isn't it? Yeah right 🙄

magnumripper avatar Apr 10 '19 21:04 magnumripper

Yes their acronyms are absolutely messed up.

  • rocm is their open source approach to providing just an OpenCL platform, as the open source AMD graphics stack on Linux lacked one completely.
  • AMDGPU-PRO is more or less a full-stack driver for the GPU, including OpenCL, and it's closed source.
  • There is a good chance that rocm was just stripped out of AMDGPU-PRO and released.

kochd avatar Apr 10 '19 22:04 kochd

Looks unrealistic we'll make relevant changes and properly re-test them on all platforms before the release. Changed milestone.

solardiz avatar May 11 '19 13:05 solardiz

rocm-dev 2.7.22 @ jumbo 88829829f23490c314c412f065d75d96cc394901 problem persists

john --test=0 --format=keepass-opencl
Device 1: gfx803 [Ellesmere [Radeon RX 470/480/570/570X/580/580X/590]]
Testing: KeePass-opencl [SHA256 AES/Twofish/ChaCha OpenCL]... Options used: -I /root/git/JohnTheRipper/run/opencl -cl-mad-enable -D__GPU__ -DDEVICE_INFO=522 -D__SIZEOF_HOST_SIZE_T__=8 -DDEV_VER_MAJOR=2949 -DDEV_VER_MINOR=0 -D_OPENCL_COMPILER -DPLAINTEXT_LENGTH=124 -DHASH_LOOPS=100 -DMAX_CONT_SIZE=16777216 /root/JohnTheRipper/run/opencl/keepass_kernel.cl
Build log: : error: undefined hidden symbol: SHA256_Update
>>> referenced by /tmp/t_23061_33-6d6703.o:(keepass_init)
>>> referenced by /tmp/t_23061_33-6d6703.o:(keepass_init)
>>> referenced by /tmp/t_23061_33-6d6703.o:(keepass_final)
>>> referenced by /tmp/t_23061_33-6d6703.o:(keepass_final)
Error: Creating the executable from LLVM IRs failed.

Error building kernel /root/JohnTheRipper/run/opencl/keepass_kernel.cl. DEVICE_INFO=522
0: OpenCL CL_BUILD_PROGRAM_FAILURE (-11) error in opencl_common.c:1376 - clBuildProgram
# rar passes now
Testing: pgpdisk-opencl [SHA1 AES/TwoFish/CAST OpenCL]... Options used: -I /root/git/JohnTheRipper/run/opencl -cl-mad-enable -D__GPU__ -DDEVICE_INFO=522 -D__SIZEOF_HOST_SIZE_T__=8 -DDEV_VER_MAJOR=2949 -DDEV_VER_MINOR=0 -D_OPENCL_COMPILER -DPLAINTEXT_LENGTH=124 -DBINARY_SIZE=16 /root/JohnTheRipper/run/opencl/pgpdisk_kernel.cl
Build log: : error: undefined hidden symbol: pgpdisk_kdf
>>> referenced by /tmp/t_24544_33-59313a.o:(pgpdisk_aes)
>>> referenced by /tmp/t_24544_33-59313a.o:(pgpdisk_aes)
>>> referenced by /tmp/t_24544_33-59313a.o:(pgpdisk_twofish)
>>> referenced by /tmp/t_24544_33-59313a.o:(pgpdisk_twofish)
>>> referenced by /tmp/t_24544_33-59313a.o:(pgpdisk_cast)
>>> referenced by /tmp/t_24544_33-59313a.o:(pgpdisk_cast)
Error: Creating the executable from LLVM IRs failed.

Error building kernel /root/JohnTheRipper/run/opencl/pgpdisk_kernel.cl. DEVICE_INFO=522
0: OpenCL CL_BUILD_PROGRAM_FAILURE (-11) error in opencl_common.c:1376 - clBuildProgram
7z passes now

I can't use john --test=0 --format=opencl anymore to test them all. The first failure stops the iteration.

NT-opencl still fails in production despite passing the test.

kochd avatar Sep 22 '19 00:09 kochd

@solardiz do these tests include rocm as platform ?

kochd avatar Sep 22 '19 00:09 kochd

@kochd What do you refer to by "these tests"? Personally, I don't test on rocm.

solardiz avatar Sep 22 '19 09:09 solardiz

Looks unrealistic we'll make relevant changes and properly re-test them on all platforms before the release.

kochd avatar Sep 22 '19 09:09 kochd

That comment applied to what we did before the 1.9.0-jumbo-1 release back in May, and it specifically excluded this issue from the pre-release testing. As in: we knew some formats fail tests on rocm, and were not going to fix that.

solardiz avatar Sep 22 '19 09:09 solardiz

@kochd Maybe you can contribute not only problem reports, but also fixes/workarounds? :-) You're on rocm, and you can try to troubleshoot and work around the issues.

solardiz avatar Sep 22 '19 10:09 solardiz

I would, but I have to admit that I lack the experience to help in that field. Also: don't get me wrong, I am not trying to generate any pressure here. My whole intent in posting updates is to give feedback instead of letting this issue become stale.

kochd avatar Sep 22 '19 10:09 kochd

@kochd We appreciate your feedback. It helps. Thanks!

solardiz avatar Sep 22 '19 10:09 solardiz

Yes it does help, the error: undefined hidden symbol is interesting and something we can start from. I would think it's a bug in the rocm runtime and not our code but I'll look at those kernels just to confirm we don't have some macro check messed up.

Oh and thanks @kochd for that explanation (back then) of what rocm is.

BTW do you have an AMD driver's "CPU device" to try as well? Maybe confirm whether that one is working or not. I expect it to work.

magnumripper avatar Sep 22 '19 20:09 magnumripper

When looking at the commits @magnumripper made since April to the rar and 7z kernels, I don't see anything obvious that would have led to rar and 7z passing now, after they failed back in April. So rocm might have fixed something related to that issue. ROCm could have some compatibility issues with the failing kernels. However: I've used other software that makes use of OpenCL without encountering such problems.

Sadly I only have that GPU available:

 ./john --list=OpenCL-devices
Platform #0 name: AMD Accelerated Parallel Processing, version: OpenCL 2.1 AMD-APP (2949.0)
    Device #0 (1) name:     gfx803
    Board name:             Ellesmere [Radeon RX 470/480/570/570X/580/580X/590]
    Device vendor:          Advanced Micro Devices, Inc.
    Device type:            GPU (LE)
    Device version:         OpenCL 1.2 
    Driver version:         2949.0 (HSA1.1,LC) - AMDGPU-Pro  
    Native vector widths:   char 4, short 2, int 1, long 1
    Preferred vector width: char 4, short 2, int 1, long 1
    Global Memory:          8 GB
    Global Memory Cache:    16 KB
    Local Memory:           64 KB (Local)
    Constant Buffer size:   6963 MB
    Max memory alloc. size: 6963 MB
    Max clock (MHz):        1580
    Profiling timer res.:   1 ns
    Max Work Group Size:    256
    Parallel compute cores: 36
    Stream processors:      2304  (36 x 64)
    Speed index:            3640320
    SIMD width:             16
    Wavefront width:        64
    PCI device topology:    0b:00.0

Why do you think a CPU would work? AFAIK any Ryzen CPU would make use of ROCm as well (https://rocm.github.io/hardware.html).

kochd avatar Sep 23 '19 02:09 kochd

When looking at the commits @magnumripper made since April to the rar and 7z kernels, I don't see anything obvious that would have led to rar and 7z passing now, after they failed back in April. So rocm might have fixed something related to that issue. ROCm could have some compatibility issues with the failing kernels.

I came to the same conclusion.

However: I've used other software that makes use of OpenCL without encountering such problems.

Other software may well be better for rocm. I think hashcat actively tests rocm and works around the problems seen (that is unfortunately a very time-consuming task, often involving changing random stuff and seeing what happens).

Why do you think a CPU would work? AFAIK any Ryzen CPU would make use of ROCm as well (https://rocm.github.io/hardware.html).

The traditional AMD drivers include a CPU device and that one almost never has any problems. We routinely test(ed) on it on Travis/Circle CI when available. BTW when seeing AMD bugs, we commonly only see them with specific GPU series or runtime versions while others may be perfectly fine. Even worse, when we work around a problem on one device or runtime version, we often introduce a whole different problem on some other device or runtime version. It's not very rewarding unfortunately 😢

magnumripper avatar Sep 23 '19 06:09 magnumripper

Build log: : error: undefined hidden symbol: SHA256_Update

I have found that this is caused by the use of inline in the function declarations. ROCm uses clang to compile OpenCL code. According to clang's documentation: https://clang.llvm.org/compatibility.html#inline

By default, Clang builds C code in GNU C11 mode, so it uses standard C99 semantics for the inline keyword.

The above documentation explains the semantics of inline in C99. Basically, I was able to fix the issue by removing the inline keyword, or by replacing it with extern inline.
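To illustrate the difference, here is a minimal sketch in plain C (not OpenCL C and not code from john; rotl32 and mix are hypothetical names). Under C99 semantics a plain inline definition does not provide an external definition, so if the compiler decides to emit a real call instead of inlining, the linker can be left with an undefined symbol, much like the SHA256_Update error above. Giving the helper internal linkage with static inline avoids that, and so does dropping the keyword.

```c
#include <assert.h>
#include <stdint.h>

/*
 * C99: the following form is only an "inline definition". If the compiler
 * emits a call rather than inlining, it expects an external definition to
 * exist in some other translation unit, and linking fails when there is none:
 *
 *     inline uint32_t rotl32(uint32_t x, unsigned n) { ... }
 */

/* With internal linkage this translation unit always has its own copy. */
static inline uint32_t rotl32(uint32_t x, unsigned n)
{
    assert(n < 32); /* rotation count must stay below the word size */
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Hypothetical caller, standing in for a kernel that uses the helper. */
uint32_t mix(uint32_t a, uint32_t b)
{
    return rotl32(a, 5) + b;
}
```

Whether static inline or removing the keyword is preferable for a given OpenCL runtime is a separate question. The sketch only shows why the plain inline form can end up unresolved at link time.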

simon816 avatar Nov 05 '19 21:11 simon816

Thanks! That's good input.

magnumripper avatar Nov 05 '19 21:11 magnumripper

seems to be an issue again with ROCM 4.3.0

GIJack avatar Aug 08 '21 01:08 GIJack

seems to be an issue again with ROCM 4.3.0

Several disparate problems were discussed above: What exact issue are you seeing?

magnumripper avatar Aug 08 '21 14:08 magnumripper

I have found that this is caused by the use of inline in the function declarations.

Thanks! That's good input.

For the record, current code has this in opencl_misc.h:

#if __MESA__
#define inline  // empty!
#elif __POCL__
// Do nothing (POCL complains if we redefine)
#elif gpu_amd(DEVICE_INFO) // We really target ROCM here
#define inline  static inline
#else
// Do nothing
#endif

magnumripper avatar Aug 08 '21 14:08 magnumripper

Build log: : error: undefined hidden symbol: SHA256_Update

Just to clarify: that is the exact issue I am having, and the solution works.

Also clarifying it looks like you already got it. Thank you all!

edit: It might be worthwhile releasing a version 1.8.1 with bugfixes. Just a suggestion.

GIJack avatar Aug 08 '21 15:08 GIJack

edit: It might be worthwhile releasing a version 1.8.1 with bugfixes. Just a suggestion.

Yes (although it'd be 1.9.1 or possibly bumpin' major), and we are... we're just really slow about it 😆

Oh, and thanks for following up

magnumripper avatar Aug 08 '21 18:08 magnumripper

Hello! I had the same issue (I mean error: undefined hidden symbol), but after building from the master branch it is solved! However, there is another problem: some tests fail or even crash the john process.

$ uname -a

Linux localhost 5.10.70-1-lts #1 SMP Thu, 30 Sep 2021 09:43:10 +0000 x86_64 GNU/Linux

$ john --list=build-info

Version: 1.9.0-jumbo-1+bleeding-186c9ae1e 2021-10-05 14:51:31 +0200
Build: linux-gnu 64-bit x86_64 AVX AC MPI + OMP
SIMD: AVX, interleaving: MD4:3 MD5:3 SHA1:1 SHA256:1 SHA512:1
System-wide exec: /usr/bin
System-wide home: /usr/share/john
Private home: ~/.john
CPU tests: AVX
CPU fallback binary: john-non-avx
$JOHN is /usr/share/john/
Format interface version: 14
Max. number of reported tunable costs: 4
Rec file version: REC4
Charset file version: CHR3
CHARSET_MIN: 1 (0x01)
CHARSET_MAX: 255 (0xff)
CHARSET_LENGTH: 24
SALT_HASH_SIZE: 1048576
SINGLE_IDX_MAX: 2147483648
SINGLE_BUF_MAX: 4294967295
Effective limit: Number of salts vs. SingleMaxBufferSize
Max. Markov mode level: 400
Max. Markov mode password length: 30
gcc version: 11.1.0
GNU libc version: 2.33 (loaded: 2.33)
OpenCL headers version: 2.0
Crypto library: OpenSSL
OpenSSL library version: 0101010cf
OpenSSL 1.1.1l  24 Aug 2021
GMP library version: 6.2.1
File locking: fcntl()
fseek(): fseek
ftell(): ftell
fopen(): fopen
memmem(): System's
times(2) sysconf(_SC_CLK_TCK) is 100
Using times(2) for timers, resolution 10 ms
HR timer: clock_gettime(), latency 20 ns
Total physical host memory: 15370 MiB
Available physical host memory: 9638 MiB
Terminal locale string: en_GB.UTF-8
Parsed terminal locale: UTF-8

$ john --list=opencl-devices

Platform #0 name: AMD Accelerated Parallel Processing, version: OpenCL 2.0 AMD-APP.dbg (3305.0)
    Device #0 (1) name:     gfx902:xnack-
    Board name:             Renoir
    Device vendor:          Advanced Micro Devices, Inc.
    Device type:            GPU (LE)
    Device version:         OpenCL 2.0 
    Driver version:         3305.0 (HSA1.1,LC) - AMDGPU-Pro  
    Native vector widths:   char 4, short 2, int 1, long 1
    Preferred vector width: char 4, short 2, int 1, long 1
    Global Memory:          512 MiB
    Global Memory Cache:    16 KiB
    Local Memory:           64 KiB (Local)
    Constant Buffer size:   435 MiB
    Max memory alloc. size: 435 MiB
    Max clock (MHz):        1600
    Profiling timer res.:   1 ns
    Max Work Group Size:    256
    Parallel compute cores: 27
    Stream processors:      1728  (27 x 64)
    Speed index:            2764800
    SIMD width:             16
    Wavefront width:        64
    PCI device topology:    03:00.0

$ john --test=0 --format=opencl

Device 1@localhost: gfx902:xnack- [Renoir]
Warning: cryptosafe-opencl format should always be UTF-8. Use --target-encoding=utf8
Testing: cryptosafe-opencl [AES-256-CBC OpenCL]... Build log: warning: argument unused during compilation: '-I opencl' [-Wunused-command-line-argument]
1 warning generated.

PASS
Testing: sha1crypt-opencl, (NetBSD) [PBKDF1-SHA1 OpenCL]... Build log: warning: argument unused during compilation: '-I opencl' [-Wunused-command-line-argument]
1 warning generated.

PASS
Testing: KeePass-opencl [SHA256 AES/Twofish/ChaCha OpenCL]... Build log: warning: argument unused during compilation: '-I opencl' [-Wunused-command-line-argument]
1 warning generated.

FAILED (get_key(0))
Testing: oldoffice-opencl, MS Office <= 2003 [MD5/SHA1 RC4 OpenCL]... Build log: warning: argument unused during compilation: '-I opencl' [-Wunused-command-line-argument]
1 warning generated.

PASS
Testing: PBKDF2-HMAC-MD4-opencl [PBKDF2-MD4 OpenCL]... Build log: warning: argument unused during compilation: '-I opencl' [-Wunused-command-line-argument]
1 warning generated.

PASS
Testing: PBKDF2-HMAC-MD5-opencl [PBKDF2-MD5 OpenCL]... Build log: warning: argument unused during compilation: '-I opencl' [-Wunused-command-line-argument]
1 warning generated.

PASS
Testing: PBKDF2-HMAC-SHA1-opencl [PBKDF2-SHA1 OpenCL]... Build log: warning: argument unused during compilation: '-I opencl' [-Wunused-command-line-argument]
1 warning generated.

PASS
Testing: rar-opencl, RAR3 (length 5) [SHA1 OpenCL AES]... (8xOMP) Build log: warning: argument unused during compilation: '-I opencl' [-Wunused-command-line-argument]
1 warning generated.

PASS
Testing: RAR5-opencl [PBKDF2-SHA256 OpenCL]... Build log: warning: argument unused during compilation: '-I opencl' [-Wunused-command-line-argument]
1 warning generated.

PASS
Testing: TrueCrypt-opencl [RIPEMD160 AES256_XTS OpenCL]... Build log: warning: argument unused during compilation: '-I opencl' [-Wunused-command-line-argument]
1 warning generated.

PASS
Testing: lotus5-opencl, Lotus Notes/Domino 5 [OpenCL]... Build log: warning: argument unused during compilation: '-I opencl' [-Wunused-command-line-argument]
1 warning generated.

PASS
Testing: AndroidBackup-opencl [PBKDF2-SHA1 AES OpenCL]... Build log: warning: argument unused during compilation: '-I opencl' [-Wunused-command-line-argument]
1 warning generated.

PASS
Testing: agilekeychain-opencl, 1Password Agile Keychain [PBKDF2-SHA1 AES OpenCL]... Build log: warning: argument unused during compilation: '-I opencl' [-Wunused-command-line-argument]
1 warning generated.

PASS
Testing: ansible-opencl, Ansible Vault [PBKDF2-SHA256 HMAC-SHA256 OpenCL]... Build log: warning: argument unused during compilation: '-I opencl' [-Wunused-command-line-argument]
1 warning generated.

PASS
Testing: axcrypt-opencl [SHA1 AES OpenCL]... Build log: warning: argument unused during compilation: '-I opencl' [-Wunused-command-line-argument]
1 warning generated.

PASS
Testing: axcrypt2-opencl, AxCrypt 2.x [PBKDF2-SHA512 AES OpenCL]... Build log: warning: argument unused during compilation: '-I opencl' [-Wunused-command-line-argument]
1 warning generated.

PASS
Build log: /tmp/comgr-c5a8a0/input/CompileSource:117:17: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
        __attribute__((max_constant_size(16)))
                       ^~~~~~~~~~~~~~~~~~~~~
/tmp/comgr-c5a8a0/input/CompileSource:121:17: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
        __attribute__((max_constant_size(72)))
                       ^~~~~~~~~~~~~~~~~~~~~
/tmp/comgr-c5a8a0/input/CompileSource:129:17: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes]
        __attribute__((max_constant_size(4096)))
                       ^~~~~~~~~~~~~~~~~~~~~~~
3 warnings generated.
warning: argument unused during compilation: '-I opencl' [-Wunused-command-line-argument]
1 warning generated.

LWS=8 GWS=1024
Testing: bcrypt-opencl ("$2a$05", 32 iterations) [Blowfish OpenCL]... FAILED (cmp_exact(1))
[localhost:365838] *** Process received signal ***
[localhost:365838] Signal: Bus error (7)
[localhost:365838] Signal code:  (128)
[localhost:365838] Failing at address: (nil)
[localhost:365838] [ 0] /usr/lib64/libpthread.so.0(+0x13870)[0x7f0b72cf3870]
[localhost:365838] [ 1] /usr/lib64/libc.so.6(+0x8827f)[0x7f0b72b9a27f]
[localhost:365838] [ 2] /usr/lib64/libc.so.6(cfree+0x68)[0x7f0b72b9d9e8]
[localhost:365838] [ 3] john(+0x3ecdc3)[0x55bf9d8c9dc3]
[localhost:365838] [ 4] john(+0x3ecc6e)[0x55bf9d8c9c6e]
[localhost:365838] [ 5] john(+0x345690)[0x55bf9d822690]
[localhost:365838] [ 6] john(+0x336c45)[0x55bf9d813c45]
[localhost:365838] [ 7] john(+0x350fba)[0x55bf9d82dfba]
[localhost:365838] [ 8] /usr/lib64/libc.so.6(__libc_start_main+0xd5)[0x7f0b72b39b25]
[localhost:365838] [ 9] john(+0xa008e)[0x55bf9d57d08e]
[localhost:365838] *** End of error message ***
[1]    365838 bus error (core dumped)  john --test=0 --format=opencl

g00g1 avatar Oct 05 '21 14:10 g00g1

Build log: warning: argument unused during compilation: '-I opencl' [-Wunused-command-line-argument] 1 warning generated.

This warning simply can't be correct because if the compiler didn't obey it, it wouldn't be able to build the kernels.

Build log: /tmp/comgr-c5a8a0/input/CompileSource:117:17: warning: unknown attribute 'max_constant_size' ignored [-Wunknown-attributes] __attribute__((max_constant_size(16))) ^~~~~~~~~~~~~~~~~~~~~

Perhaps we should just drop all such __attribute__((max_constant_size(n))) lines now. I doubt they were ever needed on any gear, they were just a compiler hint, right?
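One way to try that without touching every declaration twice would be to route the hint through a macro that expands to nothing by default. The following is only a sketch in plain C under stated assumptions: MAX_CONSTANT_SIZE, HYPOTHETICAL_USE_MAX_CONSTANT_SIZE and sum_words are made-up names, not existing john or OpenCL identifiers, and the parameter below merely stands in for a kernel argument.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical opt-in switch (not an existing john or OpenCL macro).
 * Leave it undefined and the hint disappears from every declaration.
 */
#ifdef HYPOTHETICAL_USE_MAX_CONSTANT_SIZE
#define MAX_CONSTANT_SIZE(n) __attribute__((max_constant_size(n)))
#else
#define MAX_CONSTANT_SIZE(n) /* expands to nothing */
#endif

/* Stand-in for a kernel parameter that used to carry the size hint. */
unsigned sum_words(const unsigned *words MAX_CONSTANT_SIZE(16), size_t count)
{
    unsigned total = 0;

    assert(words != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        total += words[i];
    return total;
}
```

With the switch left undefined, compilers that do not know the attribute never see it, so the unknown-attribute warning would not be triggered.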

Other than that, unfortunately we've become used to AMD drivers failing with many formats, and working around their bugs is Sisyphean work. Also bcrypt-opencl currently lacks a maintainer.

magnumripper avatar Oct 05 '21 19:10 magnumripper