OpenBLAS
OpenBLAS copied to clipboard
[WIP] Create a bunch of build tests
The build system is somewhat fragile. This script does builds using a bunch of different Makefile.rule settings, to help find bugs, regressions and corner cases in the build system. The script is not intended to be run in Travis, but as a manual test, for example before creating a new release.
There are two things this script tests:
-
All builds should succeed (might add expected failures, builds that must fail)
-
Builds that are run with equivalent options should yield the same binary (tested via SHA hash of libopenblas.so)
-
[ ] Ready for review
What is missing in travis.yml ?
What is missing in travis.yml ?
I am not sure what you mean by that. AFAIK Travis only checks if the builds succeed, but has no way of checking if the resulting binary is the one that should have been built. Since most of these tests require at least 2 builds, I think they are not suitable to be run on Travis. The only one so far that should probably be added to the Travis builds is the 32-bit build.
I am not yet convinced that this is worth the trouble, but I won't stop you... If your impression that the build system as a whole is fragile comes from #1974, that bug was only introduced into the develop branch a few days ago as a tentative (and as yet unconfirmed) fix for #1947.
I have found that issue when I started working on this project. IMO the build system is large, complicated and has a lot of moving parts. Also there is a large reliance on #ifdefs, flags that get passed around various makefiles and the CPU and feature detection logic is an interacting mix of makefiles, C and Perl.
Overall the system gives me the impression that it is rather easy to break it in unexpected ways when implementing new features. I think some time ago there was an issue, where having SOME_FEATURE=<default_of_some_feature> in Makefile.rule was not equivalent to having that line commented out.
Having tests for things like this should give more confidence. Also, I want to make sure that I do not give false information regarding compile time options in #1827
Ah well, that would be #1422 (though I think I got the worst offenders already).
@martin-frbg I have found some build time quirks of the threading system, can you confirm that this the intended behavior?
- Setting NUM_THREADS = 1 is not the same as USE_THREAD = 0
- Setting USE_THREAD = 0 NUM_THREADS = 16 is not the same as USE_THREAD = 0, and also not the same as NUM_THREADS = 1
PS: When I say "setting NUM_THREADS = 1", that means the only deviation from the defaults is NUM_THREADS = 1
Correct in the sense that this is the behaviour inherited from GotoBLAS (with the primary difference being the sizing of an internal memory buffer), though it is not entirely clear if that was Kazushige Goto's intention or a flaw in his implementation (that may not have mattered as much on the x86 systems of ten years ago).
Is there any conceivable reason why anyone would want to use a build with USE_THREAD = 0 NUM_THREADS > 1 ?
Because if there is none, I would probably advocate for refusing to build with such a setting, similar to how the combination of USE_THREAD = 0 USE_OPENMP = 1 refuses to build. Or, alternatively, issue a warning and do the build with NUM_THREADS = 1. I think the fact that NUM_THREADS affects the binary in any way when threading is explicitly disabled is counter-intuitive.
See #1141 please. Even the single-threaded OpenBLAS may make use of the internal buffer - yes this is counter-intuitive but it is how the original library was coded and changing it now is a non-trivial task.
There is fixed compiled-in struct permitting 2 blas_memory_alloc()/..free() instances at any time for each of openblas (NUM_)threads for all openblas threads in fact occuring in current process. It was recently fixed at minimum 50 buffers to not hurt casual users of 1-threaded openblas. The change becomes worse than non-trivial as there is no uniform adherence to these builltin memory handling routines, sometimes it is malloc etc. It would take extreme effort to even "correctly" use same functions at all occurences.
Wouldn't it make more sense to use NUM_PARALLEL to make sure there are enough buffers for thread safety? I mean the current comment in Makefile.rule claims it is for parallel OpenMP builds only, but given its stated purpose of ensuring thread safety, I think single threaded builds could also use that. Looking at this line: https://github.com/xianyi/OpenBLAS/blob/89b60dab8ad21a0cc6320cbd9fcd603c4c4bfc81/common.h#L186 it looks like MAX_PARALLEL_NUMBER already does increase the number of buffers, as desired.
Thats old codebase, probably most of concern goes away once USE_TLS gets stable. 50 and MAX_NUMBER just wins some time to let USE_TLS stage before general use.