libpointmatcher
libpointmatcher copied to clipboard
Bug: build fail on Ubuntu bionic and focal on linux/arm64 when running utest while LPM is compiled with --cmake-build-type Debug
REF:
- NorLab TeamCity build server (require vpn access): Build #17
- Build log: Libpointmatcher_Bleeding_build_matrix_build_and_test_multi-arch_ci_PR_docker_images_17.log.zip
Notes:
- Might not be useful to fix, however the doc should warn user that
utest
might fail when libpointmatcher is compiled with--cmake-build-type Debug
Setup:
This command require multi-architecture docker builder set with linux/amd64 and linux/arm64 platform but note that the tests are failing on linux/arm64 only
bash build_system/lpm_crawl_libpointmatcher_build_matrix.bleeding.bash \
--repository-version-build-matrix-override latest \
--cmake-build-type-build-matrix-override Debug \
--fail-fast -- build ci_PR_arm64v8 ci_PR_amd64
Build log summary from NorLab TeamCity build server
Stacktrace (partial)
See build log in attach
Looking at the test times, we should always be running them in Release
build. For example, from the linked log:
22284.6 [ OK ] DataFilterTest.SpectralDecompositionDataPointsFilter (12328508 ms)
that is over 3 hours on a single test.
For the failed tests themselves, here's my two cents: The release
and debug
build trigger different optimization and in some cases, order of floating-point operations by the compiler, leading to different results between these two builds.
I did some tests in the linked branch, and the tests don't fail anymore when using double
precision (can be set on this line).
Note however that the tests related to IO operations start to trigger, which is caused by a different precision when saving point clouds to files.
@boxanm The cmake flag Debug is already not part of the PR to develop build matrix nor the ones for PR to master or release. Debug is only part of the bleeding edge build matrix. Also the failing tests are executed on arm64
architecture virtualization for the time being (until we have a dedicated arm64 machine) so they are 10 time slower then the amd64
ones. All tests in Debug passes on native amd64
.
That said, the faster tests are the better. However, 3 hours or 24 hours long for a test is just an inconvenient, what matter in the end is the test case ... providing it's a useful one.
Either we drop the ~3 hours long tests because we judge them irrelevant or we clearly state in the doc that the utest are not to be executed at all when compiled with other than --cmake-build-type Release
. One thing for sure is that we can't drop a test case or a build config in order to save time unless we have a fallback strategy to cover that case an other way.
I think your " two cents" take on cmake flag optimization makes a lot of sense and is worth investigating further to help us decide.
PS The reference from the linked log: #16 22284.6 [ OK ]
point to issue #16 from 2013, I guess that you wanted to refer something else?
PS The reference
from the linked log: #16 22284.6 [ OK ]
point to issue #16 from 2013, I guess that you wanted to refer something else?
Thanks, weird markdown auto-formatting happened there. I pushed another change to the linked branch, adding a precision argument to the output stream buffers. The tests are now stable on both debug and release when using double precision, so I suggest I create a PR and we can close this issue afterwards.
@boxanm You're a beast man.
PS Will take time next summer to optimize our CI workflow speed, ie: add arm64 native builder and improve our testing strategy