
cblas_sgemm returns all zeros

Open odieXin opened this issue 8 years ago • 7 comments

Hi Xianyi,

We tried to run a matrix multiplication with cblas_sgemm or cblas_dgemm on Android, using A = [1 3 4 6], B = [3 5 9 1], and C = A * B. We initialized C to all zeros. After the call, C does not contain A * B; it remains all zeros. Do you have any hints? Thanks!

odieXin avatar Feb 08 '17 18:02 odieXin

Could you please state which version of OpenBLAS you used and what kind of Android hardware (CPU), and perhaps post the code snippet with your cblas_sgemm call, in case a calling parameter is wrong? If you have not already done so, you could also confirm that your code works with some other implementation of (C)BLAS, e.g. the netlib reference implementation.

martin-frbg avatar Feb 08 '17 19:02 martin-frbg

@martin-frbg

  1. We used the latest OpenBLAS code.

  2. We tried two devices and ended up with the same issue on both: (1) Google Pixel (Qualcomm Snapdragon 821) (2) Samsung G920

  3. Test code:

    float A[4] = {1, 2, 3, 4};
    float B[4] = {1, 1, 1, 1};
    float C[4] = {0, 0, 0, 0};
    int M = 2;
    int N = 2;
    int K = 2;
    float alpha = 1.0f;
    float beta = 0.0f;
    int lda = 2;
    int ldb = 2;

    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, M, N, K, alpha, A, lda, B, ldb, beta, C, N);

  4. How we compiled OpenBLAS on Android (we compiled on both Ubuntu and Mac OS and got the same results):

    set(CMAKE_SHARED_LINKER_FLAGS "-Wl,--no-warn-mismatch -lm_hard ${CMAKE_SHARED_LINKER_FLAGS}")
    add_library(lib_openblas STATIC IMPORTED)
    set_target_properties(lib_openblas PROPERTIES IMPORTED_LOCATION ../../../../../thirdparty/OpenBLAS/lib/armeabi/libopenblas.a)

    target_link_libraries(
        # Specifies the target library.
        native-lib
        ${log-lib}
        lib_openblas)

  5. We have checked that several functions do work: (1) cblas_copy (2) cblas_swap

  6. Two major functions do not work: (1) cblas_sgemm (2) cblas_dgemm

Error: C is initialized to zero and always comes back as zero.
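
For comparison, here is a minimal plain-C sketch of what the test call above is expected to compute, in the spirit of checking against a reference implementation. The helper name `ref_sgemm_rowmajor` is hypothetical and not part of OpenBLAS; it assumes row-major storage with no transposes, matching the arguments in the test code.

```c
#include <assert.h>

/* Naive reference for C = alpha * A * B + beta * C.
 * Row-major storage, no transposes. Hypothetical helper for
 * cross-checking results; it is not an OpenBLAS function. */
static void ref_sgemm_rowmajor(int m, int n, int k, float alpha,
                               const float *a, int lda,
                               const float *b, int ldb,
                               float beta, float *c, int ldc)
{
    /* For row-major data the leading dimension is the row length. */
    assert(lda >= k && ldb >= n && ldc >= n);

    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += a[i * lda + p] * b[p * ldb + j];

            /* BLAS convention: when beta is zero, the old C is not read. */
            float *out = &c[i * ldc + j];
            *out = (beta == 0.0f) ? alpha * acc
                                  : alpha * acc + beta * (*out);
        }
    }
}
```

Called with the same A, B, M, N, K, alpha and beta as the test code, this fills C with {3, 3, 7, 7}, so a result of {0, 0, 0, 0} from cblas_sgemm is not a valid answer for these inputs.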

odieXin avatar Feb 09 '17 05:02 odieXin

I am not really familiar with Android compilation, so I will ask further about the statement below, since I can help more on Ubuntu.

"We have compiled on both Ubuntu and Mac OS, and got the same results": can you please list the steps you used to compile?

ashwinyes avatar Feb 09 '17 06:02 ashwinyes

@ashwinyes

Steps we used to compile:

  1. Install Android Studio and download the NDK.

  2. Download the latest OpenBLAS code with git clone https://github.com/xianyi/OpenBLAS.git

  3. Compile OpenBLAS with the following commands:

    export NDK_ROOT=path_to_ndk_bundle
    export APP_ABI=android-21
    export CFLAGS="--sysroot=${NDK_ROOT}/platforms/${APP_ABI}/arch-arm"
    export PATH=$NDK_ROOT/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin:$PATH
    make TARGET=ARMV7 HOSTCC=gcc CC=arm-linux-androideabi-gcc NOFORTRAN=1
    make PREFIX=dist/armeabi/ install

  4. Create an Android project in Android Studio with C++ support.

  5. Copy OpenBLAS headers and libs to Android project.

  6. Add test code in the Android cpp file:

    float A[4] = {1, 2, 3, 4};
    float B[4] = {1, 1, 1, 1};
    float C[4] = {0, 0, 0, 0};
    int M = 2;
    int N = 2;
    int K = 2;
    float alpha = 1.0f;
    float beta = 0.0f;
    int lda = 2;
    int ldb = 2;

    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, M, N, K, alpha, A, lda, B, ldb, beta, C, N);

  7. Compile the Android project with the following config in CMakeLists.txt:

    # link open blas
    set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -mhard-float -D_NDK_MATH_NO_SOFTFP=1")
    set(CMAKE_SHARED_LINKER_FLAGS "-Wl,--no-warn-mismatch -lm_hard ${CMAKE_SHARED_LINKER_FLAGS}")
    add_library(lib_openblas STATIC IMPORTED)
    set_target_properties(lib_openblas PROPERTIES IMPORTED_LOCATION ../../../../../thirdparty/OpenBLAS/lib/armeabi/libopenblas.a)

    target_link_libraries(
        # Specifies the target library.
        native-lib

        # Links the target library to the log library
        # included in the NDK.
        ${log-lib}
        lib_openblas)

  8. Debug the Android project on the Android devices. We found that the values in C never change.
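
Because C starts out as all zeros, the debugger view cannot distinguish "sgemm never wrote to C" from "sgemm wrote zeros". One way to separate the two cases is to pre-fill C with a sentinel value before the call (beta is 0, so the old contents should be overwritten) and classify what comes back. The sketch below is only an illustration; `classify_output` and the `gemm_outcome` enum are hypothetical names, not part of OpenBLAS.

```c
/* Possible states of the output buffer after a GEMM call. */
enum gemm_outcome {
    GEMM_MATCH,      /* every element is within tol of the expected value */
    GEMM_UNTOUCHED,  /* every element still holds the sentinel            */
    GEMM_ALL_ZERO,   /* every element was overwritten with zero           */
    GEMM_MISMATCH    /* anything else, including NaN                      */
};

/* Hypothetical helper: compare c[0..count) with expected[0..count),
 * given the sentinel that c was filled with before the call. */
static enum gemm_outcome classify_output(const float *c, const float *expected,
                                         int count, float sentinel, float tol)
{
    int untouched = 1, all_zero = 1, match = 1;

    for (int i = 0; i < count; ++i) {
        float diff = c[i] - expected[i];
        if (diff < 0.0f)
            diff = -diff;

        if (c[i] != sentinel)
            untouched = 0;
        if (c[i] != 0.0f)
            all_zero = 0;
        if (!(diff <= tol))   /* written this way so NaN counts as a mismatch */
            match = 0;
    }

    if (match)
        return GEMM_MATCH;
    if (untouched)
        return GEMM_UNTOUCHED;
    if (all_zero)
        return GEMM_ALL_ZERO;
    return GEMM_MISMATCH;
}
```

For the test code above, one would set C to, say, {-1, -1, -1, -1}, call cblas_sgemm, and then classify C against {3, 3, 7, 7} with sentinel -1. GEMM_UNTOUCHED would point toward the call not reaching the kernel or not storing results, while GEMM_ALL_ZERO would point toward the kernel computing or storing zeros.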

odieXin avatar Feb 09 '17 21:02 odieXin

Thanks for the detailed steps. I don't have an arm32 machine to test on.

I tested on an aarch64 machine with TARGET=ARMV8 and the result is correct.

Since the devices you are testing on also have aarch64 processors, could you please try compiling for TARGET=ARMV8 using the Android aarch64 compilers and let us know the results?

Thanks
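
When switching between the arm32 and aarch64 builds, it can help to confirm from inside the app which architecture and float ABI the native code was actually compiled for. The sketch below relies on compiler-predefined macros (`__aarch64__`, `__arm__`, `__ARM_PCS_VFP`, `__SOFTFP__`) as GCC and Clang are understood to define them; the function names are hypothetical, and the mapping from macros to ABI names should be treated as an assumption to verify against your toolchain.

```c
/* Report the architecture this translation unit was compiled for. */
static const char *target_arch(void)
{
#if defined(__aarch64__)
    return "aarch64";
#elif defined(__arm__)
    return "arm32";
#else
    return "other";
#endif
}

/* Report the 32-bit ARM float ABI, as far as the predefined macros reveal it.
 * Assumption: __ARM_PCS_VFP indicates -mfloat-abi=hard and __SOFTFP__
 * indicates -mfloat-abi=soft; anything else on arm32 is taken to be softfp. */
static const char *float_abi(void)
{
#if defined(__aarch64__)
    return "aarch64 (single ABI)";
#elif defined(__arm__) && defined(__ARM_PCS_VFP)
    return "hard";
#elif defined(__arm__) && defined(__SOFTFP__)
    return "soft";
#elif defined(__arm__)
    return "softfp";
#else
    return "unknown";
#endif
}
```

Logging both strings at startup of the native library (for example through the NDK logging facility) shows whether the app code and the flags used to build libopenblas.a agree on the target.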

ashwinyes avatar Feb 11 '17 03:02 ashwinyes

There was a change in the register zeroing code of the s/dgemm_kernel_4x4_vfpv3.S functions for ARMv7 a year ago, when it was found that the previous code could not clear a spurious NaN (see #740). If you need to get this working on ARMv7, perhaps it would make sense to go back to 5f2fa15 (git checkout) for a quick test?

martin-frbg avatar Feb 11 '17 12:02 martin-frbg

I note that strange behavior in cblas_sgemm was also observed in #1014, although the circumstances there were less clear.

martin-frbg avatar Feb 12 '17 13:02 martin-frbg