
test failed for ctbsv on blas tester

Open qqqil opened this issue 3 years ago • 9 comments

bin/xcl2blastst -R tbsv -U 1 U -A 1 N -D 2 U N -n 4 -X 2 2 1 -q 3

bin/xcl2blastst -R tbsv -U 1 U -A 1 N -D 2 U N -n 4 -X 2 2 1 -q 3

----------------------------- TBSV ------------------------------
TST# UPLO TRAN DIAG    N    K  LDA INCX   TIME  MFLOP  SpUp  TEST
==== ==== ==== ==== ==== ==== ==== ==== ====== ====== ===== =====
ERROR: resid=98473.945312, normD=0.469560, normA=0.393167, normX=0.539283, eps=1.192093e-06
resid=98473.945312
   0    U    N    U    4    3    4    2   0.00    5.6  1.00 -----
   0    U    N    U    4    3    4    2   0.00    3.1  0.56 FAIL
   1    U    N    U    4    3    4    1   0.00   75.5  1.00 -----
   1    U    N    U    4    3    4    1   0.00   75.5  1.00 PASS
ERROR: resid=28358.013672, normD=0.188386, normA=1.393167, normX=0.539283, eps=1.192093e-06
resid=28358.013672
   2    U    N    N    4    3    4    2   0.00    0.0  1.00 -----
   2    U    N    N    4    3    4    2   0.00   75.5  0.00 FAIL
   3    U    N    N    4    3    4    1   0.00    0.0  1.00 -----
   3    U    N    N    4    3    4    1   0.00   75.5  0.00 PASS

4 tests run, 2 passed

qqqil avatar Aug 06 '21 02:08 qqqil

Hi, what kind of CPU are you using (/proc/cpuinfo)? Which version of OpenBLAS (e.g. the last block of the compilation output)? Which parameters did you give to CMake or make? Which compilers (if they come from the system, roughly the version of that system)?

Please copy over the extra information you gave in the nearly identical report at https://github.com/xianyi/BLAS-Tester/issues/5

brada4 avatar Aug 06 '21 05:08 brada4

Sandybridge

bin/xcl2blastst -R tbsv -U 1 U -A 1 N -D 2 U N -n 4 -X 2 2 1 -q 3


bin/xcl2blastst -R tbsv -U 1 U -A 1 N -D 2 U N -n 4 -X 2 2 1 -q 3 


----------------------------- TBSV ------------------------------
TST# UPLO TRAN DIAG    N    K  LDA INCX   TIME  MFLOP  SpUp  TEST
==== ==== ==== ==== ==== ==== ==== ==== ====== ====== ===== =====
   0    U    N    U    4    3    4    2   0.00    7.9  1.00 -----
   0    U    N    U    4    3    4    2   0.00    1.2  0.15 PASS 
   1    U    N    U    4    3    4    1   0.00   25.2  1.00 -----
   1    U    N    U    4    3    4    1   0.00   25.2  1.00 PASS 
   2    U    N    N    4    3    4    2   0.00   37.7  1.00 -----
   2    U    N    N    4    3    4    2   0.00   14.4  0.38 PASS 
   3    U    N    N    4    3    4    1   0.00   37.7  1.00 -----
   3    U    N    N    4    3    4    1   0.00   17.8  0.47 PASS 

4 tests run, 4 passed

brada4 avatar Aug 06 '21 05:08 brada4

Yes, please name your CPU model and the OpenBLAS version you tried. (Not reproduced with the current version on Haswell, Nehalem, ARMV8 or CortexA57 either.)

martin-frbg avatar Aug 06 '21 22:08 martin-frbg

Hi martin-frbg, the CPU model is aarch64 (TSV110) and OpenBLAS is the latest version, 0.3.17. Output of gcc -v:

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/aarch64-linux-gnu/7.3.0/lto-wrapper
Target: aarch64-linux-gnu
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-languages=c,c++,objc,obj-c++,fortran,lto --enable-plugin --enable-initfini-array --disable-libgcj --without-isl --without-cloog --enable-gnu-indirect-function --build=aarch64-linux-gnu --with-stage1-ldflags=' -Wl,-z,relro,-z,now' --with-boot-ldflags=' -Wl,-z,relro,-z,now' --with-multilib-list=lp64
Thread model: posix
gcc version 7.3.0 (GCC)

qqqil avatar Aug 09 '21 03:08 qqqil

Thank you - this is a bit strange, as CTBSV should depend only on CDOT and CAXPY, and TSV110 does not have a cpu-specific optimized BLAS kernel for either.

martin-frbg avatar Aug 09 '21 07:08 martin-frbg
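
As a rough illustration of the dependency described above, here is a minimal Python sketch of the upper-triangular, non-transposed band solve (the U / N cases in the test output), written as column-wise AXPY-style updates; the transposed case would naturally use DOT-style reductions instead. This is not OpenBLAS's code. The names `tbsv_upper_notrans` and `dense_to_upper_band` are hypothetical, gathering the strided vector into a contiguous work list is purely an illustrative choice, and only positive `incx` is handled. In the failing report above, the FAIL rows are the ones with INCX=2, so the example exercises that stride.

```python
def dense_to_upper_band(a, k):
    """Pack the upper triangle of dense matrix a into BLAS-style band storage.

    Row k + i - j of column j holds a[i][j] for max(0, j - k) <= i <= j.
    """
    n = len(a)
    ab = [[0j] * n for _ in range(k + 1)]
    for j in range(n):
        for i in range(max(0, j - k), j + 1):
            ab[k + i - j][j] = a[i][j]
    return ab


def tbsv_upper_notrans(n, k, ab, x, incx=1, unit_diag=False):
    """Solve A * x = b in place, A upper triangular with k superdiagonals.

    On entry x holds b with stride incx (assumed positive); on exit it holds
    the solution at the same positions.
    """
    # Gather the strided vector into a contiguous buffer (illustrative choice).
    work = [x[i * incx] for i in range(n)]
    for j in range(n - 1, -1, -1):
        if not unit_diag:
            work[j] /= ab[k][j]  # the diagonal lives in band row k
        t = work[j]
        # AXPY-style update: remove column j's contribution from the rows above.
        for i in range(max(0, j - k), j):
            work[i] -= t * ab[k + i - j][j]
    # Scatter the solution back to the strided positions.
    for i in range(n):
        x[i * incx] = work[i]
    return x


# Same shape as the failing case: N=4, K=3, INCX=2, non-unit diagonal.
a = [
    [2 + 1j, 1 - 1j, 0.5j, 1 + 0j],
    [0, 3 - 1j, 1 + 2j, -1j],
    [0, 0, 1 + 1j, 2 + 0j],
    [0, 0, 0, 4 - 2j],
]
x_true = [1 + 1j, -2 + 0j, 0.5j, 3 - 1j]
b = [sum(a[i][j] * x_true[j] for j in range(4)) for i in range(4)]

x = [0j] * 7
x[0::2] = b  # place b at every second position
tbsv_upper_notrans(4, 3, dense_to_upper_band(a, 3), x, incx=2)
err = max(abs(x[2 * i] - x_true[i]) for i in range(4))
print("max abs error:", err)
```

A hand-rolled reference like this can be compared against the library's result for the same inputs to narrow down whether the strided path or the arithmetic kernels are at fault.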

Not reproduced on ThunderX2T99 (same CAXPY kernel, different CDOT) with gcc 7.5.0. Retriggering the CI job now until it gets scheduled on Falkor hardware (sadly, I do not have access to an actual TSV110).

martin-frbg avatar Aug 10 '21 12:08 martin-frbg

TSV110 (tuning/arch) support appears in mainline gcc 9.1.0.

Older compilers are likely to carry backports of varying quality (7.3.0 looks like Ubuntu 18.04.0 without patches?).

Could you try repeating the tests with TARGET=ARMV8, just to rule out any mistakes in gcc 7's TSV110 support?

brada4 avatar Aug 10 '21 20:08 brada4

Not reproduced in multiple runs on the 96-core ThunderX provided by drone.io CI, not even when building for TSV110 (or ARMV8/CortexA57) on that hardware. So I suspect either a compiler or a hardware problem in your case.

martin-frbg avatar Aug 11 '21 17:08 martin-frbg

Managed to drop the gcc version in the CI job down to 7.3.0, and still cannot reproduce this.

martin-frbg avatar Aug 13 '21 07:08 martin-frbg