lapack Allow a possible installation of index-64 library alongside standard index-32 library?

Currently debian has (some) separate headers for blas64 and cblas64 (though not from the reference implementation).

I am not sure if they are correct or not, with regards to the reference index64 API (they are from the blis library).

Would it be possible to add an option to cmake, something like BUILD_INDEX64, which is defaulted to OFF, but if has been turned on, it will create the index-64 libraries?
If I make a PR for such an option, would it be entertained as a possibility?

Some things which I had in mind for allowing this to co-exist with the standard installation - name the libraries to libblas64.so, libcblas64.so, liblapack64.so, liblapacke64.so, this way there is no conflict between the library names (though of course, you can't link with both libblas and libblas64 at the same time). Also the library would need to be compiled twice, once for index32 and once for index64 (but that is a perfectly normal scenario and not a deal breaker). The only conflict I am unable to think of a good resolution is the header file names. If following debians style, it might be wise to call the c headers ending with 64 as well. (I'm maintaining this for Gentoo and would like to keep the ecosystem very close to debian, so that we have minimal problems for developers switching between systems)

I'm open to any suggestions before I make the PR :heart:

Thanks, Aisha

Nov 01 '20 01:11 epsilon-0

Hi Aisha, that makes total sense to me, but let us see if we have feedback from others. So let us wait a few days. J

Nov 01 '20 01:11 langou

Most definitely, that sounds like a good plan.

Nov 01 '20 01:11 epsilon-0

OMG @langou you are so fast :heart:

Just for completeness, I am writing down things that we still have to do:

Figure out how to name headers so that 32bit API can co-exist with 64bit API
Fix printf/fprintf statements so that they use correct qualifier for printing.

Any suggestions to solve the first point are welcome, I unfortunately don't have a "clean" solution.

Nov 05 '20 17:11 epsilon-0

A couple of questions, which will help me handle the naming for the files

There seems to be a tonne of duplicate definitions between cblas_f77.h and cblas_test.h. Do we really need that?
Does cblas_test.h need to be installed? Given its name (and the files that it is used in) I presume it to be used only during the testing phase. Maybe we should not install this file on a system-wide level?

Nov 06 '20 23:11 epsilon-0

Hi @epsilon-0,

Some things which I had in mind for allowing this to co-exist with the standard installation - name the libraries to libblas64.so, libcblas64.so, liblapack64.so, liblapacke64.so, this way there is no conflict between the library names (though of course, you can't link with both libblas and libblas64 at the same time).

you might be looking for PR #218. The author of this PR is Björn Esser from the Fedora Project.

Feb 15 '21 08:02 christoph-conrads

Hi @epsilon-0. Did #462 solve this issue?

Apr 21 '21 20:04 weslleyspereira

@weslleyspereira no this isn't complete yet. There needs to be some more header renaming/handling done. I am busy for the next few weeks so am not going to be able to do this soon. Basic outline

the header files should be called cblas.h and cblas64.h, likewise for other headers
- this means the *.c files would need some slight adjustment to include the proper header, but this is only during build time, so it can be hacked out.
the cmake files should be installed under lapack64 or cblas64, etc.

Apr 21 '21 20:04 epsilon-0

Ok, I see. Thanks for the quick follow-up!

Apr 21 '21 20:04 weslleyspereira

I have similar trouble trying to package things with pkgsrc. I would like to have a complete install of the reference, with cblas and lapacke. For differing implementations being installed at the same time, I settled for differing library names and sub-directories for the headers, so for example

/usr/lib/libopenblas.so
/usr/lib/libopenblas64.so
/usr/lib/libblas.so
/usr/lib/libcblas.so
/usr/lib/libblas64.so
/usr/lib/libcblas64.so
/usr/include/openblas/cblas.h
/usr/include/openblas64/cblas.h
/usr/include/netlib/cblas.h
/usr/include/netlib64/cblas.h
/usr/include/cblas.h -> netlib/cblas.h (for compatibility, having the default)

(and so forth)

We do not consider runtime switching like the binary distros, so it is OK if each cblas.h (and lapacke.h) is specific to its matched library, like with extra names for libopenblas. Build-time selection happens via

BLAS_INCLUDES=-I/prefix/include/netlib64
BLAS_LIBS=-lblas64
CBLAS_LIBS=-lcblas64

(etc.) That's what the .pc files are supposed to say, and it a lot easier than to communicate a different header file name. They are not yet consistent in that, but I am fixing it up. Seems so far people just hacked that in their distros, if bothering with all reference libs at all.

I got one question about those headers, though.

I am hacking the cmake build to get each component built separately, and am trying some other fixup (see https://github.com/Reference-LAPACK/lapack/pull/556). I get the libblas.so and libblas64.so libraries built fine, I get the header dirs configured … but the installed cblas.h and lapacke.h are identical for the 32 and 64 bit indexing versions. This is at odds with openblas: There, I got a crucial difference that I do not see for the netlib builds:

diff -ruN /data/pkg/include/openblas/openblas_config.h /data/pkg/include/openblas64/openblas_config.h
--- /data/pkg/include/openblas/openblas_config.h	2021-06-03 19:03:53.000000000 +0200
+++ /data/pkg/include/openblas64/openblas_config.h	2021-06-03 19:13:36.000000000 +0200
@@ -44,6 +44,7 @@
 #define OPENBLAS_DLOCAL_BUFFER_SIZE 32768
 #define OPENBLAS_CLOCAL_BUFFER_SIZE 16384
 #define OPENBLAS_ZLOCAL_BUFFER_SIZE 12288
+#define OPENBLAS_USE64BITINT 
 #define OPENBLAS_GEMM_MULTITHREAD_THRESHOLD 4
 #define OPENBLAS_VERSION " OpenBLAS 0.3.15 "
 /*This is only for "make install" target.*/

For the reference libraries, all headers from the 32 and 64 bit index builds are identical and apparently users are expected to put -DWeirdNEC in their flags (might have been funny 30 years ago) for cblas.h and -DLAPACK_ILP64 -DHAVE_LAPACK_CONFIG_H. Since people use the optimized BLAS libraries in production, the de facto standard is not to expose that to the users. These feed back on the reference, IMHO, and the headers installed from an ILP64 build should not require funky flags to avoid crashing your app when linking to the 64 bit lib.

Do we agree that it is the right solution to modify the headers at build time to define the correct integers?

Btw, the cblas config files that are installed also miss any reference to the necessary defs, so are broken for the 64 bit index builds, as it seems. But actually, I ponder not installing these at all. They are redundant with the .pc files and make it possibly harder to convince cmake-using dependent packages to accept a packager choice via BLAS_LIBS etal.

Jun 03 '21 17:06 drhpc

PS: With Intel MKL, there is a central switch -DMKL_ILP64 to be set. I imagine setting up trivial include/intel-mkl64/cblas.h with

#ifndef MKL_ILP64
#define MKL_ILP64
#endif
#include <mkl_cblas.h>

to fit the general scheme. I could also put the define into BLAS_INCLUDES, same for the weird netlib defines. What is better? Do we want to do it like Intel or like OpenBLAS?

Jun 03 '21 17:06 drhpc

Do we agree that it is the right solution to modify the headers at build time to define the correct integers?

Yes. I agree with that, and prefer the solution that does not replicate the entire header. I think it is cleaner.

Btw, the cblas config files that are installed also miss any reference to the necessary defs, so are broken for the 64 bit index builds, as it seems.

Right. I just installed the 64-bits libraries (BUILD_INDEX64=ON) and couldn't see anything telling me to use WeirdNEC, LAPACK_ILP64 or HAVE_LAPACK_CONFIG_H. Thanks for noticing that!

Jun 07 '21 21:06 weslleyspereira

Yes. I agree with that, and prefer the solution that does not replicate the entire header. I think it is cleaner.

This is ambiguous to me. Which is the cleaner solution? What I am preparing now is such:

#if defined(WeirdNEC) || @HAVE_ILP64@
   #define CBLAS_INDEX long
   #ifndef WeirdNEC
   #define WeirdNEC
   #endif
#else
   #define CBLAS_INDEX int
#endif

The CMakeFile shall replace HAVE_ILP with 1 or 0, the resulting header being installed for the current build.

(Btw.: long wouldn't work on Windows. It's long long there … or int64_t on all platforms with stdint.)

Right. I just installed the 64-bits libraries (BUILD_INDEX64=ON) and couldn't see anything telling me to use WeirdNEC, LAPACK_ILP64 or HAVE_LAPACK_CONFIG_H. Thanks for noticing that!

I am imagining a future where you do

cc -I/foo/include/netlib64 -o bar bar.c -L/foo/lib -lcblas64

And things are handled in foo/include/netlib64/cblas.h, otherwise by foo/include/netlib/cblas.h (possibly linked to foo/include/cblas.h).

I have the suspicion that this is not what you meant, but I want to convince that it is better;-)

You could try to not to duplicate the header by placing 'the' header in /foo/include/cblas.h and have /foo/include/netlib64/cblas.h include that one only by defining WeirdNEC, but that means that the 64 bit and 32 bit packages share that common header file, which is messy for packaging. It is way better if each puts its file into separate places/names. The name needs to stay cblas.h because you don't want to go around replacing #include <cblas.h> lines.

Edit: Also, having cblas.h include ../cblas.h is messy by itself. Also we define one header installation directory for cmake. By default that's /foo/include, not /foo/netlib64/include. I am not going to change this default. Packagers will have to specify the subdirectory like this (BSD make in pkgsrc):

.if !empty(LAPACK_COMPONENT:M*64)
.  if empty(MACHINE_ARCH:M*64)
PKG_FAIL_REASON+=       "${LAPACK_COMPONENT} incompatible with non-64-bit platform"
.  endif
HEADERDIR=netlib64
.else
HEADERDIR=netlib
.endif

# Note: We patch the build to install both static and
# shared libraries.
CMAKE_ARGS=     -DBUILD_DEPRECATED=ON \
                -DBUILD_SHARED_LIBS=ON \
                -DBUILD_STATIC_LIBS=ON \
                -DCMAKE_INSTALL_INCLUDEDIR=${PREFIX}/include/${HEADERDIR} \
                ${LAPACK_COMPONENT_CMAKE_ARGS}

Jun 08 '21 00:06 drhpc

A beautiful aspect of shipping/installing the 32 bit cblas.h with this modification to the usual location is that the original mechanics still work. Only the 64 bit variant will enforce WeirdNEC. You could decide to only install the 64 bit one into a prefix and keep the other parts of the ecosystem untouched.

Jun 08 '21 00:06 drhpc

Oh, come on … the CBLAS/cmake/cblas-config-install.cmake.in seems to forget -DCMAKE_INSTALL_INCLUDEDIR, doesn't it?

# Report lapacke header search locations.
set(CBLAS_INCLUDE_DIRS ${_CBLAS_PREFIX}/include)

(The comment is sugar on top.)

I have the feeling that the CMake build is a lot less mature at one would think. Is the project serious about having that as primary build or is this just a drive-by contribution? I am really tempted to rather fix up the old-style Makefile, less fuss all around. But I now sank so much time into fixing up the CMake stuff, that I detest anyway. So I would like to get it over with.

Jun 08 '21 00:06 drhpc

I have to give up now … I managed to move cblas.h to cblas.h.in as indicated above, and added

configure_file(${CMAKE_CURRENT_SOURCE_DIR}/cblas.h.in cblas.h @ONLY)
configure_file(${CMAKE_CURRENT_SOURCE_DIR}/cblas_f77.h.in cblas_f77.h @ONLY)

to CBLAS/include/CMakeLists.txt, also having defined @HAVE_ILP64@ to 1 or 0 in the toplevel CMakeLists.txt. But I can for the life of me not figure out how to make the install stuff which is in a higher-level CMakeLists.txt install the generated headers, or the weird copy of the same from ${LAPACK_BINARY_DIR}/include (really? A copy within the source tree?)

What is the macro append_subdir_files supposed to do? It seems to prepend a copy of the prefix to the header paths. I got not enough or too much path to the source header files. I just want to install the header files from HERE to THERE, dammit.

Can someone knowledgeable help out here? I guess I could figure it out tomorrow, but I am not sure if that is without smashing something in the real world for emotional relief.

Jun 08 '21 02:06 drhpc

Yes. I agree with that, and prefer the solution that does not replicate the entire header. I think it is cleaner.

This is ambiguous to me. Which is the cleaner solution? What I am preparing now is such:
#if defined(WeirdNEC) || @HAVE_ILP64@
   #define CBLAS_INDEX long
   #ifndef WeirdNEC
   #define WeirdNEC
   #endif
#else
   #define CBLAS_INDEX int
#endif
The CMakeFile shall replace HAVE_ILP with 1 or 0, the resulting header being installed for the current build.

(Btw.: long wouldn't work on Windows. It's long long there … or int64_t on all platforms with stdint.)

Right. I just installed the 64-bits libraries (BUILD_INDEX64=ON) and couldn't see anything telling me to use WeirdNEC, LAPACK_ILP64 or HAVE_LAPACK_CONFIG_H. Thanks for noticing that!

I am imagining a future where you do
cc -I/foo/include/netlib64 -o bar bar.c -L/foo/lib -lcblas64
And things are handled in foo/include/netlib64/cblas.h, otherwise by foo/include/netlib/cblas.h (possibly linked to foo/include/cblas.h).

I have the suspicion that this is not what you meant, but I want to convince that it is better;-)

Sorry, let me explain. At first, I liked the idea of keeping the original header cblas.h and creating include/netlib64/cblas.h and include/netlib/cblas.h with something like

#if defined(WeirdNEC)
   #define WeirdNEC
#endif
#include <cblas.h>

You could try to not to duplicate the header by placing 'the' header in /foo/include/cblas.h and have /foo/include/netlib64/cblas.h include that one only by defining WeirdNEC, but that means that the 64 bit and 32 bit packages share that common header file, which is messy for packaging. It is way better if each puts its file into separate places/names. The name needs to stay cblas.h because you don't want to go around replacing #include <cblas.h> lines.

Edit: Also, having cblas.h include ../cblas.h is messy by itself. Also we define one header installation directory for cmake.

but yes, we would have to use include/netlib64 and include in as include dirs if we one header includes the other.

By default that's /foo/include, not /foo/netlib64/include. I am not going to change this default. Packagers will have to specify the subdirectory like this (BSD make in pkgsrc):

.if !empty(LAPACK_COMPONENT:M*64)
.  if empty(MACHINE_ARCH:M*64)
PKG_FAIL_REASON+=       "${LAPACK_COMPONENT} incompatible with non-64-bit platform"
.  endif
HEADERDIR=netlib64
.else
HEADERDIR=netlib
.endif

# Note: We patch the build to install both static and
# shared libraries.
CMAKE_ARGS=     -DBUILD_DEPRECATED=ON \
                -DBUILD_SHARED_LIBS=ON \
                -DBUILD_STATIC_LIBS=ON \
                -DCMAKE_INSTALL_INCLUDEDIR=${PREFIX}/include/${HEADERDIR} \
                ${LAPACK_COMPONENT_CMAKE_ARGS}

That seems good to me. So, you would only add an alternative to build LAPACK without having to guess the compiler flags. But the current way would also work.

(Btw.: long wouldn't work on Windows. It's long long there … or int64_t on all platforms with stdint.)

Good to know. BLAS++ and LAPACK++ use int64_t instead of long long.

Jun 08 '21 12:06 weslleyspereira

@weslleyspereira So you at first liked this idea:

#if defined(WeirdNEC)
   #define WeirdNEC
#endif
#include "../cblas.h"

with /prefix/include/cblas.h and /prefix/include/netlib64/cblas.h, the latter locating the former? But you do agree now that it is a more robust solution to install a header that looks like this for a 64 bit build?

#if defined(WeirdNEC) || @HAVE_ILP64@
   #define CBLAS_INDEX long
   #ifndef WeirdNEC
   #define WeirdNEC
   #endif
#else
   #define CBLAS_INDEX int
#endif

(long vs. int64 is a different matter, but I am all for doing that change, just like BLAS++)

Heck, I am not even sure if it is safe to assume that `#include ".../cblas.h" will find only the other indented header. The C standard seems to say that search order is implementation-defined, not necessarily relative to the current header. My main issue as packager is that I'd need a separate package for that common header or have the 64 bit package depend on the 32 bit one just for that. This would suck.

I would really like to go forward now with such a change for pkgsrc, to settle a change for the upstream code later. We could discuss a new symbol to force 32 bit indices or 64 bit indices explicitly with any of the headers (-DNETLIB_INDEX_BITS=64 ?), just defaulting to what the library was built with.

Can I get agreement about our intended solution being this?

lib/libcblas64.so
include/optional_subdir64/cblas.h

and

lib/libcblas.so
include/optional_subdir/cblas.h

Each build of LAPACK code results in headers that, at least by default, match the installed libraries without the user defining anything. OK?

I then could slip this in before the upcoming release of pkgsrc (deadline nearing) and we can further discuss the details of that implementation so that I can drop the patches after merging something here, with a new LAPACK release. With this change, the plain Makefile build also needs fixing, but I don't need that for my patches yet when I just use the CMake build.

(Just need to somehow check my temper when trying to beat that weird CMake build into submission, where it shuffles header copies around the build directories and then cannot find them for install. Or decide about those broken .cmake files having any use for us, maybe just drop them from install … we got pkg-config!)

Jun 09 '21 08:06 drhpc

Anything? I must admit that I don't see much chance for a different solution in practice, as this is the example set by openblas, the main implementation we use. I can imagine convincing Intel to have subdirectory for 64 bit/32 bit index headers, too, wrapping over their mkl_cblas.h and mkl_lapacke.h. Otherwise I build a simple package that just provides those.

include/mkl-blas/cblas.h
include/mkl-blas64/cblas.h

Currently, I added machinery to pkgsrc to provide builds with the funny -DWeirdNEC -DHAVE_LAPACK_CONFIG_H -DLAPACK_ILP64 line, with both cblas and cblas64 installing identical headers. It could stay that way, but I still think it makes sense to have the header set up to match the build ABI.

Jun 10 '21 18:06 drhpc

@weslleyspereira So you at first liked this idea:
#if defined(WeirdNEC)
   #define WeirdNEC
#endif
#include "../cblas.h"
with /prefix/include/cblas.h and /prefix/include/netlib64/cblas.h, the latter locating the former? But you do agree now that it is a more robust solution to install a header that looks like this for a 64 bit build?
#if defined(WeirdNEC) || @HAVE_ILP64@
   #define CBLAS_INDEX long
   #ifndef WeirdNEC
   #define WeirdNEC
   #endif
#else
   #define CBLAS_INDEX int
#endif

Yes, that's it. I agree with your solution of having subfolders for the 32- and 64-bits headers. I discussed this with @langou, and he was also convinced this would be a good solution.

(long vs. int64 is a different matter, but I am all for doing that change, just like BLAS++)

Right. This should be addressed in another issue.

I would really like to go forward now with such a change for pkgsrc, to settle a change for the upstream code later. We could discuss a new symbol to force 32 bit indices or 64 bit indices explicitly with any of the headers (-DNETLIB_INDEX_BITS=64 ?), just defaulting to what the library was built with.

Can I get agreement about our intended solution being this?
lib/libcblas64.so
include/optional_subdir64/cblas.h
and
lib/libcblas.so
include/optional_subdir/cblas.h

Yes. I think you can go forward and propose a PR in the future, thanks! I personally think a new symbol like NETLIB_INDEX_BITS makes total sense. I would just make sure the default value remains 32, and that -DWeirdNEC implies -DNETLIB_INDEX_BITS=64.

Each build of LAPACK code results in headers that, at least by default, match the installed libraries without the user defining anything. OK?

Sounds good to me.

I then could slip this in before the upcoming release of pkgsrc (deadline nearing) and we can further discuss the details of that implementation so that I can drop the patches after merging something here, with a new LAPACK release. With this change, the plain Makefile build also needs fixing, but I don't need that for my patches yet when I just use the CMake build.

Ok! We will probably have a LAPACK release in the second semester of 2021. And yes, the Makefile should be adjusted accordingly, and I am willing to help with that.

Jun 11 '21 14:06 weslleyspereira

This is somewhat related. We should not forget that the headers for netlib CBLAS are not only provided by netlib … NumPy always uses its own header:

https://github.com/numpy/numpy/blob/main/numpy/core/src/common/npy_cblas.h

And in this header, it sets CBLAS_INDEX=size_t, different from the integer type used in specifying indices. It is used solely for the return values of some functions:

$ grep CBLAS_INDEX ./numpy/core/src/common/npy_cblas_base.h                                                                                                                                  
CBLAS_INDEX BLASNAME(cblas_isamax)(const BLASINT N, const float  *X, const BLASINT incX);
CBLAS_INDEX BLASNAME(cblas_idamax)(const BLASINT N, const double *X, const BLASINT incX);
CBLAS_INDEX BLASNAME(cblas_icamax)(const BLASINT N, const void   *X, const BLASINT incX);
CBLAS_INDEX BLASNAME(cblas_izamax)(const BLASINT N, const void   *X, const BLASINT incX);

The difference:

$ grep cblas_isamax ./numpy/core/src/common/npy_cblas_base.h  /data/pkg/include/cblas.h                                                                                                      
./numpy/core/src/common/npy_cblas_base.h:CBLAS_INDEX BLASNAME(cblas_isamax)(const BLASINT N, const float  *X, const BLASINT incX);
/data/pkg/include/cblas.h:CBLAS_INDEX cblas_isamax(const CBLAS_INDEX N, const float  *X, const CBLAS_INDEX incX);

I wonder if that's possibly causing trouble. For Netlib, there is only one type of index, while other implementations use a differing return value type for the index functions. OpenBLAS sets the example. They say isamax returns unsigned size_t, but the C wrapper actually calls a Fortran function that returns a signed integer (Edit: A subroutine that writes a signed 32 or 64 bit integer value to a handed-in reference to an unsigned 64 bit variable on 64 bit systems).

Does the reference implementation have an opinion about this? I guess there is no real trouble, as a size_t value will always be able to hold any non-negative return from isamax(). But it smells iffy. (Edit: You could build with 64 bit indices on a 32 bit system where size_t is 32 bits, right? Then you got overflow. In additon to the uneasyness of casting size_t * to int *.)

Since optmized implementations seem to have decided on size_t there, should the reference accept that fact and follow?

Jun 12 '21 07:06 drhpc

And how dangerous is it, actually, to link numpy with reference cblas?

Jun 12 '21 07:06 drhpc

OpenBLAS sets the example. (...) Since optmized implementations seem to have decided on size_t there, should the reference accept that fact and follow?

I certainly cannot speak for numpy (or mkl etc. for that matter), but I'd hesitate to claim OpenBLAS to be normative in any form, least of all relative to what is (I believe) generally seen as the reference implementation...

Jun 12 '21 16:06 martin-frbg

Sure. It's just that OpenBLAS or MKL are what people use in practice and both seem to have settled on

#define CBLAS_INDEX size_t  /* this may vary between platforms */
#ifdef MKL_ILP64
#define MKL_INT MKL_INT64
#else
#define MKL_INT int
#endif
CBLAS_INDEX cblas_isamax(const MKL_INT N, const float  *X, const MKL_INT incX);

or similarily

#ifdef OPENBLAS_USE64BITINT
typedef BLASLONG blasint;
#else
typedef int blasint;
#endif
#define CBLAS_INDEX size_t
CBLAS_INDEX cblas_isamax(OPENBLAS_CONST blasint n, OPENBLAS_CONST float  *x, OPENBLAS_CONST blasint incx);

vs. the reference

#ifdef WeirdNEC
   #define CBLAS_INDEX long
#else
   #define CBLAS_INDEX int
#endif
CBLAS_INDEX cblas_isamax(const CBLAS_INDEX N, const float  *X, const CBLAS_INDEX incX);

How come that they diverge from the reference here? Was there communication about that? Also … I see MKL and OpenBLAS defining a host of functions that aren't even part of reference CBLAS:

CBLAS_INDEX cblas_isamin(const MKL_INT N, const float  *X, const MKL_INT incX);
CBLAS_INDEX cblas_idamin(const MKL_INT N, const double *X, const MKL_INT incX);
CBLAS_INDEX cblas_icamin(const MKL_INT N, const void   *X, const MKL_INT incX);
CBLAS_INDEX cblas_izamin(const MKL_INT N, const void   *X, const MKL_INT incX);

CBLAS_INDEX cblas_isamin(OPENBLAS_CONST blasint n, OPENBLAS_CONST float  *x, OPENBLAS_CONST blasint incx);
CBLAS_INDEX cblas_idamin(OPENBLAS_CONST blasint n, OPENBLAS_CONST double *x, OPENBLAS_CONST blasint incx);
CBLAS_INDEX cblas_icamin(OPENBLAS_CONST blasint n, OPENBLAS_CONST void  *x, OPENBLAS_CONST blasint incx);
CBLAS_INDEX cblas_izamin(OPENBLAS_CONST blasint n, OPENBLAS_CONST void *x, OPENBLAS_CONST blasint incx);

CBLAS_INDEX cblas_ismax(OPENBLAS_CONST blasint n, OPENBLAS_CONST float  *x, OPENBLAS_CONST blasint incx);
CBLAS_INDEX cblas_idmax(OPENBLAS_CONST blasint n, OPENBLAS_CONST double *x, OPENBLAS_CONST blasint incx);
CBLAS_INDEX cblas_icmax(OPENBLAS_CONST blasint n, OPENBLAS_CONST void  *x, OPENBLAS_CONST blasint incx);
CBLAS_INDEX cblas_izmax(OPENBLAS_CONST blasint n, OPENBLAS_CONST void *x, OPENBLAS_CONST blasint incx);

CBLAS_INDEX cblas_ismin(OPENBLAS_CONST blasint n, OPENBLAS_CONST float  *x, OPENBLAS_CONST blasint incx);
CBLAS_INDEX cblas_idmin(OPENBLAS_CONST blasint n, OPENBLAS_CONST double *x, OPENBLAS_CONST blasint incx);
CBLAS_INDEX cblas_icmin(OPENBLAS_CONST blasint n, OPENBLAS_CONST void  *x, OPENBLAS_CONST blasint incx);
CBLAS_INDEX cblas_izmin(OPENBLAS_CONST blasint n, OPENBLAS_CONST void *x, OPENBLAS_CONST blasint incx);

So, extending the standard is one thing, but size_t vs int seems to be a serious issue on 64 bit systems. This should be settled some way. It seems to me that the Netlib way is sensible: Same type as that which is used for indices. As all call Fortran routines like this in the end

c     isamaxsub.f
c
c     The program is a fortran wrapper for isamax.
c     Witten by Keita Teranishi.  2/11/1998
c
      subroutine isamaxsub(n,x,incx,iamax)
c
      external isamax
      integer  isamax,iamax
      integer n,incx
      real x(*)
c
      iamax=isamax(n,x,incx)
      return
      end

… handing in an address of size_t for iamax, that just seems wrong. I didn't find another implementation than this reference one in the OpenBLAS sources. Are they just stupid changing the external type like that or am I overlooking something very basic? Is anyone actually using these functions?

Jun 12 '21 17:06 drhpc

Hi all, Reference BLAS, reference CBLAS, reference LAPACK, two of the main thrusts of these projects are (1) numerical algorithms and (2) defining common interfaces, a reference implementation and a test suite that goes this. I think everyone involved in these projects is happy to look and learn from other projects (OpenBLAS, MKL, etc.) about software engineering, best practice for deploying the software, etc. We have a lot to learn from these projects. (And we also learn a lot from other numerical linear algebra projects!) Anyhow: reference BLAS, CBLAS, LAPACK can use some improvement in its CMake packaging, interfaces, and if OpenBLAS (e.g.) has a better process, that is well suited for us, well, I am all in favor moving toward this model.

Jun 12 '21 17:06 langou

To add some context, the CBLAS was born from a committee (the Basic Linear Algebra Subprograms Technical Forum) that worked from 1996 to 2000 on revisiting the BLAS, as part of this they defined a C interface for the BLAS. See: http://www.netlib.org/blas/blast-forum/ In particular see: http://www.netlib.org/blas/blast-forum/cinterface.pdf I believe that the CBLAS offered by LAPACK is an implementation of the interface as defined by the Basic Linear Algebra Subprograms Technical Forum 25 years ago.

If there are suggestions to improve CBLAS, send them along. I can try to pass this to the various stakeholders.

Jun 12 '21 19:06 langou

Thanks for the pointer. So the relevant part seems to be B.2.2 in that spec, which says that BLAS_INDEX usually is size_t, but might also be chosen to be identical to the (signed) Fortran integer type used for indexing. It's up to the implementation.

So it seems that popular optimized implementations chose size_t and the Netlib reference chose the same integer it uses for Fortran. I see copies of cblas.h all around in various projects that use the lib (like numpy, shipping a header for an external lib), with that line

#define CBLAS_INDEX size_t  /* this may vary between platforms */

In https://github.com/LuaDist/gsl/blob/master/cblas/gsl_cblas.h, this is accompanied by

/* This is a copy of the CBLAS standard header.
 * We carry this around so we do not have to
 * break our model for flexible BLAS functionality.
 */

This sounds like this originated in the reference implementation, but has since changed? Looking at 41779680d1f233928b67f5f66c0b239aecb42774 … I see that the CBLAS_INDEX switch with WeirdNEC has been there before the 64 bit build. Wow, is this commit recent. Now I see that size_t was in the referenc cblas.h until 2015, 83fc0b48afd1f9a6d6f8dddb16e69ed7ed0e7242 having changed it and introduced the WeirdNEC define. I did not imagine that this is so recent! Wildly confusing.

I also see that the earlier version of cblas.h handed an int to the fortran call, now CBLAS_INDEX. This seems to be correct now, with consistent usage of CBLAS_INDEX as integer type and the switch for 32 or 64 bits in the Fortran part.

But could it be that the optimized libraries that inherited an old version of cblas.h with size_t but sync sources with current CBLAS code from the reference have a nice bug going? Aren't they doing something like this for the 32 bit case on a 64 bit system?

#include <stdio.h>
#include <stdlib.h>


void ia_max(int a, void *b)
{
	int *ib = (int*)b;
	*ib = a*2;
}


int main(int argc, char **argv)
{
	int i = atoi(argv[1]);
	size_t maxi;
	ia_max(i, &maxi);
	printf("in %d out %ld\n", i, (long)maxi);
	return 0;
}

This results in

$ gcc -O -o t t.c
$ ./t 42
in 42 out 140724603453524

Iniitalizing the size_t value to zero helps, but probably only in the little-endian case. Does nobody get into trouble for this? I have to be missing something.

To conclude:

Reference CBLAS had the size_t as return value first.
It used int in the actual call to Fortran, though.
Downstream (optimized BLAS, CBLAS users) run with the old version of the header.
Reference CBLAS introduces WeirdNEC hack for a specific system, replacing size_t with int or long (matching the Fortran side?!)
64 bit Reference CBLAS interface gets build on top of that, using CBLAS_INDEX everywhere for Fortran default integer.
Downstreams did their own thing with 64 bit support, but separating it from CBLAS_INDEX, which is always size_t.
Downstreams inherit the CBLAS wrappers that employ CBLAS_INDEX to call Fortran that expects default integer.

This sounds like a wonderful breakage as a result. Headers and code diverged. How come nobody noticed issues yet? Or did I miss the part where the reference CBLAS wrapper code for isamax and friends is not actually used?

Jun 13 '21 07:06 drhpc

OpenBLAS at least does not use the CBLAS wrapper code from Reference-LAPACK (and never did, the source is there but does not get built)

Jun 13 '21 10:06 martin-frbg

@martin-frbg Good to know. Can you point out a code path for, say, x86-64 that shows how the size_t is passed around to the actual computation for cblas_isamax()? I found some specific kernel implementation but am not sure about the general case.

Would be good to know that nobody actually passes a (size_t*) to the Fortran interface.

For sure it's not good that projects just assume

size_t cblas_isamax(…)

when the actual library might offer an int or long (or int64_t) as return value. Might work most of the time with values in 64 bit registers, but it's not nice. Can we rectify this in the implementations? People did not pick up on the example of Netlib in the past 5 years about consinstent use of CBLAS_INDEX.

Jun 13 '21 12:06 drhpc

relevant code is in OpenBLAS/interface, e.g. interface/imax.c gets compiled to cblas_isamax() when CBLAS is defined, no Fortran code involved in its call graph.

Jun 13 '21 12:06 martin-frbg

Ah, good. So the one case which is actually problematic is depending projects using a copy of cblas.h that does not fit the library.

I don't find actual usage of cblas_isamax() and friends in NumPy (and SciPy), so this might be only a theoretical issue. It should be fixed nonetheless. So:

Others follow the Netlib example of using int32_t/int64_t (let's be explicit while at that;-) BLAS_INDEX for both size returns and index arguments.
Netlib caves in and reverts to size_t for those returns like others.

Is this a separate issue to discuss? It does releate to the choice of 32 or 64 bit library, though.

PS: I am still unsure if enums in the API are a good idea (as actual data type for function arguments and struct members), as there are compiler options to change what integer is used underneath them. Not that relevant in practice, but makes me uneasy nevertheless.

Jun 13 '21 21:06 drhpc

lapack lapack copied to clipboard

Allow a possible installation of index-64 library alongside standard index-32 library?

lapack
lapack copied to clipboard