concurrencpp icon indicating copy to clipboard operation
concurrencpp copied to clipboard

GCC 11 and GCC 12 support

Open chausner opened this issue 3 years ago • 8 comments

This PR provides the following improvements:

  • adds fixes and tweaks to make the library compile with GCC 11 and GCC 12
  • builds and runs tests using GCC 11 and GCC 12 in GitHub CI
  • adds support for ThreadSanitizer with GCC

chausner avatar Sep 05 '21 12:09 chausner

@David-Haim I see you added a custom binary_semaphore implementation here: https://github.com/David-Haim/concurrencpp/pull/47

With that, maybe even GCC 10 would then be supported because I think the only error I got when trying GCC 10 was the missing implementation of std::binary_semaphore.

chausner avatar Sep 05 '21 12:09 chausner

@David-Haim I see you added a custom binary_semaphore implementation here: #47

With that, maybe even GCC 10 would then be supported because I think the only error I got when trying GCC 10 was the missing implementation of std::binary_semaphore.

I tried to build with GCC 10 but seems like the coroutine implementation is buggy:

[ 13%] Building CXX object CMakeFiles/shared_result_resolving_tests.dir/source/tests/result_tests/shared_result_resolve_tests.cpp.o In file included from /mnt/c/Users/chris/Source/Repos/concurrencpp/include/concurrencpp/concurrencpp.h:14, from /mnt/c/Users/chris/Source/Repos/concurrencpp/test/source/tests/result_tests/shared_result_resolve_tests.cpp:1: /mnt/c/Users/chris/Source/Repos/concurrencpp/include/concurrencpp/results/shared_result.h: In instantiation of ‘static concurrencpp::shared_result concurrencpp::shared_result::make_shared_result(concurrencpp::details::shared_result_tag, concurrencpp::result) [with type = int]’: /mnt/c/Users/chris/Source/Repos/concurrencpp/include/concurrencpp/results/shared_result.h:36:39: required from ‘concurrencpp::shared_result::shared_result(concurrencpp::result) [with type = int]’ /mnt/c/Users/chris/Source/Repos/concurrencpp/test/source/tests/result_tests/shared_result_resolve_tests.cpp:173:29: required from ‘void concurrencpp::tests::test_shared_result_resolve_impl() [with type = int]’ /mnt/c/Users/chris/Source/Repos/concurrencpp/test/source/tests/result_tests/shared_result_resolve_tests.cpp:192:42: required from here /mnt/c/Users/chris/Source/Repos/concurrencpp/include/concurrencpp/results/shared_result.h:17:9: internal compiler error: Segmentation fault 17 | } | ^ Please submit a full bug report, with preprocessed source if appropriate. See file:///usr/share/doc/gcc-10/README.Bugs for instructions.

The problem does not occur with GCC 11.

chausner avatar Sep 05 '21 21:09 chausner

Some tests seem to fail. Need to investigate why:

Test project /mnt/c/Users/chris/Source/Repos/concurrencpp/build/test Start 1: task_tests 1/23 Test #1: task_tests ....................... Passed 0.01 sec Start 2: runtime_tests 2/23 Test #2: runtime_tests .................... Passed 0.01 sec Start 3: inline_executor_tests 3/23 Test #3: inline_executor_tests ............ Passed 0.02 sec Start 4: manual_executor_tests 4/23 Test #4: manual_executor_tests ............ Passed 8.55 sec Start 5: thread_executor_tests 5/23 Test #5: thread_executor_tests ............ Passed 0.20 sec Start 6: thread_pool_executor_tests 6/23 Test #6: thread_pool_executor_tests .......Child abortedException: 1.16 sec Start 7: worker_thread_executor_tests 7/23 Test #7: worker_thread_executor_tests ..... Passed 0.13 sec Start 8: result_tests 8/23 Test #8: result_tests ..................... Passed 8.31 sec Start 9: result_resolving_tests 9/23 Test #9: result_resolving_tests ...........Child abortedException: 0.02 sec Start 10: result_awaiting_tests 10/23 Test #10: result_awaiting_tests ............Child abortedException: 0.02 sec Start 11: lazy_result_tests 11/23 Test #11: lazy_result_tests ................ Passed 0.02 sec Start 12: shared_result_tests 12/23 Test #12: shared_result_tests .............. Passed 8.82 sec Start 13: shared_result_resolving_tests 13/23 Test #13: shared_result_resolving_tests ....Child abortedException: 0.02 sec Start 14: shared_result_awaiting_tests 14/23 Test #14: shared_result_awaiting_tests .....Child aborted***Exception: 0.02 sec Start 15: make_result_tests 15/23 Test #15: make_result_tests ................ Passed 0.01 sec Start 16: result_promise_tests 16/23 Test #16: result_promise_tests ............. Passed 0.02 sec Start 17: when_all_tests 17/23 Test #17: when_all_tests ................... Passed 0.26 sec Start 18: when_any_tests 18/23 Test #18: when_any_tests ................... Passed 0.62 sec Start 19: resume_on_tests 19/23 Test #19: resume_on_tests .................. Passed 240.02 sec Start 20: coroutine_promise_tests 20/23 Test #20: coroutine_promise_tests .......... Passed 0.02 sec Start 21: coroutine_tests 21/23 Test #21: coroutine_tests .................. Passed 0.25 sec Start 22: timer_queue_tests 22/23 Test #22: timer_queue_tests ................ Passed 0.77 sec Start 23: timer_tests 23/23 Test #23: timer_tests ...................... Passed 341.91 sec

78% tests passed, 5 tests failed out of 23

Total Test time (real) = 611.34 sec

The following tests FAILED: 6 - thread_pool_executor_tests (Child aborted) 9 - result_resolving_tests (Child aborted) 10 - result_awaiting_tests (Child aborted) 13 - shared_result_resolving_tests (Child aborted) 14 - shared_result_awaiting_tests (Child aborted) Errors while running CTest

chausner avatar Sep 05 '21 21:09 chausner

Hi there. Thanks for the contribution and effort, concurrencpp will greatly benefit from it.

I did try to develop support for GCC a couple of weeks ago and encountered the same fails that you see in your branch. From what I've seen debugging those problems, it seems to me that GCC still produces buggy assembly code for coroutines, especially for eager coroutines.

I'm not unfamiliar with problems like this. Clang claimed to support coroutines since clang-6 but it's only since clang-11 that most big bugs were fixed and clang coroutine support became production ready.

I will try to debug it again (I'm a bit busy in the next couple of days), if some miracle happens and those fails can be easily fixed - so be it. If it's something in the compiler itself, I think we need to wait with GCC until they mature their coroutine support.

David-Haim avatar Sep 07 '21 09:09 David-Haim

From what I've seen debugging those problems, it seems to me that GCC still produces buggy assembly code for coroutines, especially for eager coroutines.

I see. Yes, I have also ran into coroutine-related compiler bugs with older GCC versions already and I had hoped the implementation would be more mature in GCC 11. :/

I think we should submit the failing code to GCC developers so that they are aware of the problems.

chausner avatar Sep 07 '21 17:09 chausner

Just tested with a recent GCC 12.0.1 20220319 build and the tests still fail with the same errors.

chausner avatar Apr 20 '22 22:04 chausner

No luck with GCC 12.2.0 either. Same tests are still failing..

chausner avatar Sep 11 '22 23:09 chausner

I had a closer look at the tests that were failing for GCC-11 and GCC-12 and found a workaround: 2e2fea89caa0569634d11058c7019c03b6f4708b

It seems the compilers get confused with lambda captures and coroutines. Luckily, the affected code is just in the tests and can be easily worked around by capturing variables by reference, which should be safe in the context where those functions are called.

With those fixes applied, the test suite completes successfully for GCC 11.2.0 and GCC 12.2.0, so I am marking this MR as ready.

chausner avatar Sep 12 '22 01:09 chausner

here is the question: If we know GCC still produces a buggy assembly for coroutines, can we honestly say that this coroutine library is supported by it? I think it kinda suggests that this library can be used with GCC as production ready, and this is not the case, at least not from the compiler side. developers will naturally use lambda captures (by value) as it's legit part of the language.

I suggest we open them a bug detailing how GCC mistreat lambda captures, what do you say?

David-Haim avatar Sep 24 '22 09:09 David-Haim

If we know GCC still produces a buggy assembly for coroutines, can we honestly say that this coroutine library is supported by it? I think it kinda suggests that this library can be used with GCC as production ready, and this is not the case, at least not from the compiler side. developers will naturally use lambda captures (by value) as it's legit part of the language.

That's a valid point, of course. How about we merge this and add a disclaimer to the README at the location where we claim GCC support and link to the issue? Then it's up to the user to decide if they want to accept this risk or not. The fact that the test suite runs through in GCC shows that it is possible to use the library with GCC today. I think there is no harm in merging this MR at this point.

I suggest we open them a bug detailing how GCC mistreat lambda captures, what do you say?

Yes, we should definitely do that. However, I haven't looked deeply enough into the issue so far to understand what the exact problem is and how to reduce it to a small test case for reproduction.

chausner avatar Sep 24 '22 09:09 chausner

The fact that the test suite runs through in GCC shows that it is possible to use the library with GCC today

Actually, let me check this again, I think I saw some CI runs where gcc with TSAN threw runtime errors.

chausner avatar Sep 24 '22 09:09 chausner

How about we merge this and add a disclaimer to the README at the location where we claim GCC support and link to the issue? Then it's up to the user to decide if they want to accept this risk or not.

I added such a disclaimer to the README: https://github.com/David-Haim/concurrencpp/pull/51/commits/1cba066d5e5c47bba31e6bc4501d2d0b272894fb

chausner avatar Sep 24 '22 09:09 chausner

I suggest we open them a bug detailing how GCC mistreat lambda captures, what do you say?

I further looked into this issue and was able to narrow it down. The bug manifests itself in statements like

co_await thread_executor->submit([this, manual_executor] {

where the lambda callable object (and thus the captured objects) are incorrectly trivially copied, i.e. the copy constructor is not called. When smart pointers are captured, it leads to incorrect reference counts/double-frees.

One workaround is to store the awaitable object in a variable before awaiting it:

auto &&result = thread_executor->submit([this, manual_executor] { /* ... */ };
co_await result;

I am using now this workaround instead of my initial approach: 6cc09bc51a608b305c5d5a432bd981e4057bcd1e

Turns out this issue has already been reported numerous times but so far nobody on the compiler team appears to be working on a fix:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98401 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99576 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100611 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101976 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104872 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105373 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105804

chausner avatar Sep 24 '22 18:09 chausner

I would like as well GCC to be part of the lib, even if it will have a mark for experimental and points to the problem.

Trafo avatar Dec 05 '22 08:12 Trafo

There is some hope that https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99576#c16 will fix this issue in gcc. The fix is currently in master.

chausner avatar Dec 05 '22 19:12 chausner

Good news, I just tested the current HEAD version of gcc and it compiles and runs all tests successfully, including the TSAN tests, without needing the GCC 11/12 workarounds.

@David-Haim Do you continue to prefer to wait until GCC 13 is out?

chausner avatar Jan 02 '23 21:01 chausner

This is great news! Good job! yes, we will wait for the next official GCC version to be released before we announce GCC support. But non the less this is amazing and will make the library be supported by all mainstream compilers. Fantastic!

David-Haim avatar Jan 03 '23 10:01 David-Haim

let's retry with gcc 13?

David-Haim avatar Apr 27 '23 08:04 David-Haim

let's retry with gcc 13?

Having some problems with building gcc 13 from source. Will have to wait until there's a PPA available with the stable release.

chausner avatar Apr 28 '23 22:04 chausner

Closing this one. For GCC 13 support, I have opened https://github.com/David-Haim/concurrencpp/pull/128.

chausner avatar May 14 '23 22:05 chausner