`./frame/base/bli_cntx.h` Requested index is out of bounds when running `test_libblis.x`
$ ./test_libblis.x -g ./testsuite/input.general -o ./testsuite/input.operations
% detected -g option; using "./testsuite/input.operations" for parameters filename.
% detected -o option; using "./testsuite/input.operations" for operations filename.
libblis: ./frame/base//bli_cntx.h (line 148):
libblis: Requested index is out of bounds.
libblis: Aborting.
1: 0x1A81BA6E bli_abort+0x76
2: 0x1A825F80 bli_cntx_init_generic_ref+0x151c
3: 0x1A7E5C78 bli_gks_cntx_ukr_is_ref+0xd4
4: 0x1A7E829E bli_ind_init+0x7c
5: 0x1A7F183E bli_pthread_switch_on+0xaa
6: 0x1A7DC8E8 bli_init_once+0x8a
7: 0x1A750D88 bli_thread_get_thread_impl+0x44
8: 0x1A74D364 libblis_test_output_params_struct+0x134
9: 0x1A74BA96 libblis_test_read_params_file+0x1350
10: 0x1ABB2ADE main+0x142
11: 0x1ABB2EFA CELQINIT+0x1aca [CELQINIT:]
I am currently porting BLIS to z/OS. I was able to get BLIS 1.1 to build and run with a few patches, but I hit this error when attempting to upgrade to BLIS 2.0. Any pointers on where to look would be greatly appreciated!
@devinamatthews Hello Devin, do you have any idea what might have gone wrong? Thank you in advance!
Can you please send the full output of configure, and the contents of bli_config.h?
@devinamatthews Thank you very much for getting back to me!
Here is the full output of configure:
configure: detected OS/390 kernel version 29.00.
configure: python interpreter search list is: python python3 python2.
configure: found 'python'.
configure: using 'python' as python interpreter.
configure: found python version 3.12.10 (maj: 3, min: 12, rev: 10).
configure: python 3.12.10 appears to be supported.
configure: user specified a C compiler via CC (clang).
configure: clang exists and appears to work.
configure: using 'clang' as C compiler.
configure: found clang version 14.0.0 (maj: 14, min: 0, rev: 0).
configure: checking for blacklisted configurations due to clang 14.0.0.
configure: checking clang 14.0.0 against known consequential version ranges.
configure: found assembler ('as') version (maj: , min: , rev: ).
configure: checking for blacklisted configurations due to as .
configure: warning: assembler ('as' ) does not support 'bulldozer'; adding to blacklist.
configure: warning: assembler ('as' ) does not support 'sandybridge'; adding to blacklist.
configure: warning: assembler ('as' ) does not support 'haswell'; adding to blacklist.
configure: warning: assembler ('as' ) does not support 'piledriver'; adding to blacklist.
configure: warning: assembler ('as' ) does not support 'steamroller'; adding to blacklist.
configure: warning: assembler ('as' ) does not support 'excavator'; adding to blacklist.
configure: warning: assembler ('as' ) does not support 'skx'; adding to blacklist.
configure: warning: assembler ('as' ) does not support 'knl'; adding to blacklist.
configure: configuration blacklist:
configure: bulldozer sandybridge haswell piledriver steamroller excavator skx knl
configure: user specified a C++ compiler via CXX (clang++).
configure: clang++ exists and appears to work.
configure: using 'clang++' as C++ compiler.
configure: Fortran compiler search list is: gfortran ifort ifx nvfortran.
configure: *** Could not find a Fortran compiler from the search list.
configure: *** Note that a Fortran compiler will not be available.
configure: library archiver search list is: ar.
configure: found 'ar'.
configure: using 'ar' as library archiver.
configure: user specified a archive indexer via RANLIB (echo).
configure: echo exists and appears to work.
configure: using 'echo' as archive indexer.
configure: reading configuration registry...done.
configure: determining default version string.
configure: found '.git' directory; assuming git clone.
configure: executing: git describe --tags --abbrev=0.
configure: got back 2.0.
configure: truncating to 2.0.
configure: starting configuration of BLIS 2.0.
configure: configuring with official version string.
configure: found shared library .so version '4.0.0'.
configure: .so major version: 4
configure: .so minor.build version: 0.0
configure: manual configuration requested; configuring with 'generic'.
configure: checking configuration against contents of 'config_registry'.
configure: configuration 'generic' is registered.
configure: 'generic' is defined as having the following sub-configurations:
configure: generic
configure: which collectively require the following kernels:
configure: generic
configure: checking sub-configurations:
configure: 'generic' is registered...and exists.
configure: checking sub-configurations' requisite kernels:
configure: 'generic' kernels...exist.
configure: detected --prefix='/data/zopen/usr/local/zopen/blis/blis-2.0'.
configure: no install exec_prefix option given; defaulting to PREFIX.
configure: no install libdir option given; defaulting to EXECPREFIX/lib.
configure: no install includedir option given; defaulting to PREFIX/include.
configure: no install sharedir option given; defaulting to PREFIX/share.
configure: final installation directories:
configure: prefix: /data/zopen/usr/local/zopen/blis/blis-2.0
configure: exec_prefix: /data/zopen/usr/local/zopen/blis/blis-2.0
configure: libdir: /data/zopen/usr/local/zopen/blis/blis-2.0/lib
configure: includedir: /data/zopen/usr/local/zopen/blis/blis-2.0/include
configure: sharedir: /data/zopen/usr/local/zopen/blis/blis-2.0/share
configure: NOTE: the variables above can be overridden when running make.
configure: detected preset CFLAGS; prepending:
configure: -std=gnu11 -fzos-le-char-mode=ascii -mnocsect -fno-short-enums -m64 -mzos-target=zosv2r5 -march=z13 -fno-builtin-malloc -fno-builtin-calloc -fno-builtin-realloc -O3 -I/data/zopen/usr/local/zopen/curl/curl-8.15.0.20250728_192752.zos/include -I/data/zopen/usr/local/zopen/libpsl/libpsl-master.20250604_154033.zos/include -I/data/zopen/usr/local/zopen/libssh2/libssh2-1.11.0.20240103_144102.zos/include -I/data/zopen/usr/local/zopen/openssl/openssl-3.5.1.20250709_113621.zos/include -I/data/zopen/usr/local/zopen/ncurses/ncurses-6.5.20250625_114034.zos/include -isystem /data/zopen/usr/local/zopen/zoslib/zoslib-zopen2.20250725_124302.zos/include -include /data/zopen/usr/local/zopen/zoslib/zoslib-zopen2.20250725_124302.zos/include/zos-v2r5-symbolfixes.h -finstrument-functions -DNSIG=42 -D_XOPEN_SOURCE=600 -D_ALL_SOURCE -D_OPEN_SYS_FILE_EXT=1 -D_AE_BIMODAL=1 -D_ENHANCED_ASCII_EXT=0xFFFFFFFF -DZOSLIB_OVERRIDE_CLIB=1 -DZOSLIB_OVERRIDE_CLIB_GETENV=1 -DZOSLIB_USE_CLIB_LOCALE=1
configure: detected preset CXXFLAGS; prepending:
configure: -fzos-le-char-mode=ascii -mnocsect -fno-short-enums -m64 -mzos-target=zosv2r5 -march=z13 -fno-builtin-malloc -fno-builtin-calloc -fno-builtin-realloc -O3 -I/data/zopen/usr/local/zopen/curl/curl-8.15.0.20250728_192752.zos/include -I/data/zopen/usr/local/zopen/libpsl/libpsl-master.20250604_154033.zos/include -I/data/zopen/usr/local/zopen/libssh2/libssh2-1.11.0.20240103_144102.zos/include -I/data/zopen/usr/local/zopen/openssl/openssl-3.5.1.20250709_113621.zos/include -I/data/zopen/usr/local/zopen/ncurses/ncurses-6.5.20250625_114034.zos/include -I/data/zopen/usr/local/zopen/ncurses/ncurses-6.5.20250625_114034.zos/include/ncurses -isystem /data/zopen/usr/local/zopen/zoslib/zoslib-zopen2.20250725_124302.zos/include -include /data/zopen/usr/local/zopen/zoslib/zoslib-zopen2.20250725_124302.zos/include/zos-v2r5-symbolfixes.h -finstrument-functions
configure: detected preset LDFLAGS; prepending:
configure: -Wl,-bedit=no -m64 -L/data/zopen/usr/local/zopen/curl/curl-8.15.0.20250728_192752.zos/lib -L/data/zopen/usr/local/zopen/libpsl/libpsl-master.20250604_154033.zos/lib -L/data/zopen/usr/local/zopen/libssh2/libssh2-1.11.0.20240103_144102.zos/lib -L/data/zopen/usr/local/zopen/openssl/openssl-3.5.1.20250709_113621.zos/lib -L/data/zopen/usr/local/zopen/ncurses/ncurses-6.5.20250625_114034.zos/lib -L/data/zopen/usr/local/zopen/zoslib/zoslib-zopen2.20250725_124302.zos/lib
configure: disabling verbose make output. (enable with 'make V=1'.)
configure: disabling ARG_MAX hack.
configure: debug symbols disabled.
configure: AddressSanitizer support disabled.
configure: building BLIS as a static library (shared library disabled).
configure: enabling operating system support.
configure: enabling thread-local storage (TLS) support.
configure: enabling support for single-threading.
configure: requesting slab work partitioning in jr and/or ir loops.
configure: internal memory pools for packing blocks are enabled.
configure: internal memory pools for small blocks are enabled.
configure: memory tracing output is disabled.
configure: ScaLAPACK compatibility is disabled.
configure: libmemkind not found; disabling.
configure: compiler appears to support #pragma omp simd.
configure: the BLAS compatibility layer is enabled.
configure: the CBLAS compatibility layer is disabled.
configure: sup (skinny/unpacked) matrix handling is enabled.
configure: trsm diagonal element pre-inversion is enabled.
configure: the BLIS API integer size is automatically determined.
configure: the BLAS/CBLAS API integer size is 32-bit.
configure: AMD-specific framework files will not be considered.
configure: configuring with no addons.
configure: configuring for conventional gemm implementation.
configure: unable to determine Fortran compiler vendor!
configure: configuring complex return type as "gnu".
configure: creating ./config.mk from ./build/config.mk.in
configure: creating ./bli_config.h from ./build/bli_config.h.in
configure: creating ./bli_addon.h from ./build/bli_addon.h.in
configure: creating ./obj/generic
configure: creating ./obj/generic/config/generic
configure: creating ./obj/generic/kernels/generic
configure: creating ./obj/generic/ref_kernels/generic
configure: creating ./obj/generic/frame
configure: creating ./obj/generic/blastest
configure: creating ./obj/generic/testsuite
configure: creating ./lib/generic
configure: creating ./include/generic
configure: mirroring ./config/generic to ./obj/generic/config/generic
configure: mirroring ./kernels/generic to ./obj/generic/kernels/generic
configure: mirroring ./ref_kernels to ./obj/generic/ref_kernels
configure: mirroring ./ref_kernels to ./obj/generic/ref_kernels/generic
configure: mirroring ./frame to ./obj/generic/frame
configure: creating makefile fragments in ./obj/generic/config/generic
configure: creating makefile fragments in ./obj/generic/kernels/generic
configure: creating makefile fragments in ./obj/generic/ref_kernels
configure: creating makefile fragments in ./obj/generic/frame
configure: configured to build within top-level directory of source distribution.
And the content of bli_config.h:
/*
BLIS
An object-based framework for developing high-performance BLAS-like
libraries.
Copyright (C) 2014, The University of Texas at Austin
Copyright (C) 2018 - 2019, Advanced Micro Devices, Inc.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
- Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
- Neither the name(s) of the copyright holder(s) nor the names of its
contributors may be used to endorse or promote products derived
from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#ifndef BLIS_CONFIG_H
#define BLIS_CONFIG_H
// Enabled configuration "family" (config_name)
#define BLIS_FAMILY_GENERIC
// Enabled sub-configurations (config_list)
#define BLIS_CONFIG_GENERIC
// Enabled kernel sets (kernel_list)
#define BLIS_KERNELS_GENERIC
#define BLIS_VERSION_STRING "2.0"
#define BLIS_VERSION_MAJOR 2
#define BLIS_VERSION_MINOR 0
#define BLIS_VERSION_REVISION 0
#if 1
#define BLIS_ENABLE_SYSTEM
#else
#define BLIS_DISABLE_SYSTEM
#endif
#if 1
#define BLIS_ENABLE_TLS
#else
#define BLIS_DISABLE_TLS
#endif
#if 0
#define BLIS_ENABLE_OPENMP
#if 0
#define BLIS_ENABLE_OPENMP_AS_DEFAULT
#endif
#endif
#if 0
#define BLIS_ENABLE_PTHREADS
#if 0
#define BLIS_ENABLE_PTHREADS_AS_DEFAULT
#endif
#endif
#if 0
#define BLIS_ENABLE_HPX
#if 0
#define BLIS_ENABLE_HPX_AS_DEFAULT
#endif
#endif
#if 1
#define BLIS_ENABLE_JRIR_SLAB
#endif
#if 0
#define BLIS_ENABLE_JRIR_RR
#endif
#if 0
#define BLIS_ENABLE_JRIR_TLB
#endif
#if 1
#define BLIS_ENABLE_PBA_POOLS
#else
#define BLIS_DISABLE_PBA_POOLS
#endif
#if 1
#define BLIS_ENABLE_SBA_POOLS
#else
#define BLIS_DISABLE_SBA_POOLS
#endif
#if 0
#define BLIS_ENABLE_MEM_TRACING
#else
#define BLIS_DISABLE_MEM_TRACING
#endif
#if 0
#define BLIS_ENABLE_SCALAPACK_COMPAT
#else
#define BLIS_DISABLE_SCALAPACK_COMPAT
#endif
#if 0 == 64
#define BLIS_INT_TYPE_SIZE 64
#elif 0 == 32
#define BLIS_INT_TYPE_SIZE 32
#else
// determine automatically
#endif
#if 32 == 64
#define BLIS_BLAS_INT_TYPE_SIZE 64
#elif 32 == 32
#define BLIS_BLAS_INT_TYPE_SIZE 32
#else
// determine automatically
#endif
#ifndef BLIS_ENABLE_BLAS
#ifndef BLIS_DISABLE_BLAS
#if 1
#define BLIS_ENABLE_BLAS
#else
#define BLIS_DISABLE_BLAS
#endif
#endif
#endif
#ifndef BLIS_ENABLE_CBLAS
#ifndef BLIS_DISABLE_CBLAS
#if 0
#define BLIS_ENABLE_CBLAS
#else
#define BLIS_DISABLE_CBLAS
#endif
#endif
#endif
#if 1
#define BLIS_ENABLE_SUP_HANDLING
#else
#define BLIS_DISABLE_SUP_HANDLING
#endif
#if 0
#define BLIS_ENABLE_MEMKIND
#else
#define BLIS_DISABLE_MEMKIND
#endif
#if 1
#define BLIS_ENABLE_TRSM_PREINVERSION
#else
#define BLIS_DISABLE_TRSM_PREINVERSION
#endif
#if 1
#define BLIS_ENABLE_PRAGMA_OMP_SIMD
#else
#define BLIS_DISABLE_PRAGMA_OMP_SIMD
#endif
#if 0
#define BLIS_ENABLE_SANDBOX
#else
#define BLIS_DISABLE_SANDBOX
#endif
#if 0
#define BLIS_ENABLE_SHARED
#else
#define BLIS_DISABLE_SHARED
#endif
#if 0
#define BLIS_ENABLE_COMPLEX_RETURN_INTEL
#else
#define BLIS_DISABLE_COMPLEX_RETURN_INTEL
#endif
#endif
For context, after some tracing I found that the testsuite fails in bli_thread_get_thread_impl(). Is this an indication that some part of the threading configuration is not set properly?
@jerryyiransun it doesn't look like I'm going to be able to reproduce this on another architecture. Does the problem still happen when configured with --enable-debug=opt? That would help narrow down the problem. Whether this is an issue with architecture selection (as indicated by your initial backtrace) or with threading backend selection (from your later comment), the real issue is almost certainly occurring earlier, and may be due to stack corruption or undefined behavior (UB).
If you can nail down the precise location of the problem (at least where bli_abort is being called from), I can work with you to work backwards towards the source. I imagine we'll need to inspect local state as we go (printf or via gdb).
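As a rough illustration of the printf-style inspection suggested above, here is a minimal sketch of a bounds check that reports the call site and the offending values instead of only the location of the check itself. `demo_check_index` and `DEMO_CHECK_INDEX` are hypothetical names made up for this example; they are not part of BLIS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not a BLIS function): returns true when idx is
 * within [0, limit). On failure it prints the caller's file and line
 * together with the index and the limit, so the bad value is visible. */
static bool demo_check_index(size_t idx, size_t limit,
                             const char *file, int line)
{
    assert(file != NULL);

    if (idx < limit)
        return true;

    fprintf(stderr, "%s (line %d): index %zu is out of bounds (limit %zu)\n",
            file, line, idx, limit);
    return false;
}

/* Capture the call site automatically at each use. */
#define DEMO_CHECK_INDEX(idx, limit) \
    demo_check_index((size_t)(idx), (size_t)(limit), __FILE__, __LINE__)
```

Placing a check like `DEMO_CHECK_INDEX(i, n)` at each suspect call, rather than inside a shared header, makes the printed line number point at the caller. A very large printed index is often a hint that the value was never initialized.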
@devinamatthews the problem still exists when configured with --enable-debug=opt. I do have a tool that lets us pin down the precise location of the problem.
If you open the following trace file in https://ui.perfetto.dev/, you should see:
Hi @jerryyiransun, can you get a gdb-style backtrace, but with files and line numbers? I'm still confused because this info and another comment indicate a problem in bli_stack_get, but the first stack trace implicates bli_cntx_init_generic_ref.
Hey @devinamatthews, I've opened a PR to fix the issue. It was caused by uninitialized memory, which led to the crash on z/OS.
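For readers who land here with a similar symptom, the following is a minimal sketch of how uninitialized memory can surface as an "index is out of bounds" abort on one platform and not on another. It is not the actual fix from the PR. `demo_table_t` and the `demo_table_*` functions are hypothetical stand-ins, not BLIS types or APIs; the assumption is simply that a counter field is read before anything has written to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_ENTRIES 8

/* Hypothetical table guarded by a bounds check (not a BLIS type). */
typedef struct {
    size_t count;                      /* number of valid entries */
    int    entries[DEMO_MAX_ENTRIES];
} demo_table_t;

/* Zero the whole struct so that count starts at a known value.
 * Without this, count holds whatever was left on the stack or heap,
 * which may happen to be zero on one platform and garbage on another. */
static void demo_table_init(demo_table_t *t)
{
    assert(t != NULL);
    memset(t, 0, sizeof(*t));
}

/* Append a value; returns false when the table is already full. */
static bool demo_table_push(demo_table_t *t, int value)
{
    if (t->count >= DEMO_MAX_ENTRIES)
        return false;                  /* the "out of bounds" case */
    t->entries[t->count++] = value;
    return true;
}

/* Bounds-checked read; returns false when i is not a valid index. */
static bool demo_table_get(const demo_table_t *t, size_t i, int *out)
{
    if (i >= t->count)
        return false;
    *out = t->entries[i];
    return true;
}
```

If `demo_table_init` is skipped, the very first `demo_table_push` can fail its bounds check even though the calling code looks correct, which matches the pattern of a failure that only shows up on a new platform.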