oneTBB
oneTBB copied to clipboard
onetbb: add ppc32 support, fix build for Darwin PPC (updated & rebased)
Description
Add a comprehensive description of proposed changes
This PR restores support for Darwin PPC: specifically, Leopard (ppc
, ppc64
, universal
), 10.6 PPC (10A190) and 10.6.8 Rosetta.
UPDATE 2023.02.19:
This should be ready, I guess. Sizes for asserts also fixed. (Alternatively, those two asserts can be disabled for Apple ppc32.)
UPDATE 2022.11.16:
I have re-done everything from scratch using the new oneTBB
version. There are substantial improvements since the last time:
- One symbol that was failing to export on
ppc32
is now fixed (so all symbols export). - Unnecessary changes to headers which @reble pointed to earlier were dropped as redundant.
- Linking flags for builds on Apple with GCC are implemented via CMake (in
GNU.cmake
). - The build now succeeds without disabling STRICT; instead, the following two flags suffice:
-Wno-unused-function
,-Wno-parentheses
. (I did not add this flags, for now, but it can be trivially done, conditioning on CMAKE_OSX_ARCHITECTURES.)
There are a couple of spots where macros for macOS version are needed and appropriate, since specific definitions are missing in 10.5 and 10.6 system headers. Those have been minimized. There is one static assert that fails on PPC – I have conditionally disabled it.
If anyone could improve on my patches, that will be greatly appreciated.
Fixes #819 - issue number(s) if exists
Building onetbb
on Darwin PPC fixed.
Type of change
Choose one or multiple, leave empty if none of the other choices apply
Add a respective label(s) to PR if you have permissions
- [x] bug fix - change that fixes an issue
- [x] new feature - change that adds functionality
- [ ] tests - change in tests
- [ ] infrastructure - change in infrastructure and CI
- [ ] documentation - documentation update
Tests
- [ ] added - required for new features and some bug fixes
- [ ] not needed
Documentation
- [ ] updated in # - add PR number
- [x] needs to be updated
- [ ] not needed Maybe just mention that Darwin PPC is again supported (as a 2-tier, or whatever it is called).
Breaks backward compatibility
- [ ] Yes
- [x] No
- [ ] Unknown It should restore backward compatibility :)
Notify the following users
List users with @
to send notifications
@vossmjp If you could help me to review this PR, it would be awesome!
Other information
My ticket earlier here: https://github.com/oneapi-src/oneTBB/issues/819
@vossmjp Could you or someone review this, please?
@reble Thank you very much for looking into this! This is greatly appreciated.
I am travelling at the moment and do not have access to my PPC machines, so I will return to this ticket by August. (I am really sorry for such a delay, but I just don’t have remote access.)
Please keep the ticket opened, and I will reply in detail on every point once I am back and can test the builds.
@reble I am gonna re-do the whole thing now, starting from just adding 32-bit symbols lists to the new version of oneTBB
and see what is required to make it build. Upon that, I will rebase this PR, with minimal changes to the sources.
@reble @vossmjp @dnmokhov I apologize for taking forever, but whenever you have time, please review the PR.
Test results (10.6 PPC, G5 Quad):
---> Testing onetbb
Executing: cd "/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_onetbb/onetbb/work/build" && /usr/bin/make test
Running tests...
/opt/local/bin/ctest --force-new-ctest-process
Test project /opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_onetbb/onetbb/work/build
Start 1: test_tick_count
1/136 Test #1: test_tick_count .......................... Passed 0.03 sec
Start 2: test_allocators
2/136 Test #2: test_allocators .......................... Passed 0.04 sec
Start 3: test_arena_priorities
3/136 Test #3: test_arena_priorities .................... Passed 0.06 sec
Start 4: test_dynamic_link
4/136 Test #4: test_dynamic_link ........................ Passed 0.02 sec
Start 5: test_collaborative_call_once
5/136 Test #5: test_collaborative_call_once ............. Passed 14.89 sec
Start 6: test_concurrent_lru_cache
6/136 Test #6: test_concurrent_lru_cache ................ Passed 0.05 sec
Start 7: test_concurrent_unordered_map
7/136 Test #7: test_concurrent_unordered_map ............Bus error***Exception: 25.01 sec
Start 8: test_concurrent_unordered_set
8/136 Test #8: test_concurrent_unordered_set ............Bus error***Exception: 19.78 sec
Start 9: test_concurrent_map
9/136 Test #9: test_concurrent_map ......................***Exception: SegFault 2.81 sec
Start 10: test_concurrent_set
10/136 Test #10: test_concurrent_set ......................Bus error***Exception: 1.44 sec
Start 11: test_concurrent_priority_queue
11/136 Test #11: test_concurrent_priority_queue ........... Passed 0.53 sec
Start 12: test_partitioner
12/136 Test #12: test_partitioner ......................... Passed 0.14 sec
Start 13: test_parallel_for
13/136 Test #13: test_parallel_for ........................ Passed 20.08 sec
Start 14: test_parallel_for_each
14/136 Test #14: test_parallel_for_each ................... Passed 1.87 sec
Start 15: test_parallel_reduce
15/136 Test #15: test_parallel_reduce ..................... Passed 9.62 sec
Start 16: test_parallel_sort
16/136 Test #16: test_parallel_sort ....................... Passed 13.60 sec
Start 17: test_parallel_invoke
17/136 Test #17: test_parallel_invoke ..................... Passed 12.12 sec
Start 18: test_parallel_scan
18/136 Test #18: test_parallel_scan ....................... Passed 7.41 sec
Start 19: test_parallel_pipeline
19/136 Test #19: test_parallel_pipeline ................... Passed 23.27 sec
Start 20: test_eh_algorithms
20/136 Test #20: test_eh_algorithms ....................... Passed 30.41 sec
Start 21: test_blocked_range
21/136 Test #21: test_blocked_range ....................... Passed 0.05 sec
Start 22: test_concurrent_vector
22/136 Test #22: test_concurrent_vector ................... Passed 1.54 sec
Start 23: test_task_group
23/136 Test #23: test_task_group ..........................***Exception: SegFault 14.54 sec
Start 24: test_concurrent_hash_map
24/136 Test #24: test_concurrent_hash_map ................. Passed 0.45 sec
Start 25: test_task_arena
25/136 Test #25: test_task_arena ..........................Bus error***Exception: 9.09 sec
Start 26: test_enumerable_thread_specific
26/136 Test #26: test_enumerable_thread_specific .......... Passed 1.38 sec
Start 27: test_concurrent_queue
27/136 Test #27: test_concurrent_queue .................... Passed 1.06 sec
Start 28: test_resumable_tasks
28/136 Test #28: test_resumable_tasks .....................Bus error***Exception: 0.58 sec
Start 29: test_mutex
29/136 Test #29: test_mutex ............................... Passed 14.20 sec
Start 30: test_function_node
30/136 Test #30: test_function_node .......................Bus error***Exception: 20.92 sec
Start 31: test_multifunction_node
^C 31/136 Test #31: test_multifunction_node ..................***Exception: Interrupt763.10 sec
Start 32: test_broadcast_node
32/136 Test #32: test_broadcast_node ...................... Passed 0.46 sec
Start 33: test_buffer_node
33/136 Test #33: test_buffer_node ......................... Passed 0.42 sec
Start 34: test_composite_node
34/136 Test #34: test_composite_node ...................... Passed 26.74 sec
Start 35: test_continue_node
^C 35/136 Test #35: test_continue_node .......................***Exception: Interrupt181.78 sec
Start 36: test_eh_flow_graph
36/136 Test #36: test_eh_flow_graph ....................... Passed 4.69 sec
Start 37: test_flow_graph
37/136 Test #37: test_flow_graph .......................... Passed 0.53 sec
Start 38: test_flow_graph_priorities
38/136 Test #38: test_flow_graph_priorities ............... Passed 1.25 sec
Start 39: test_flow_graph_whitebox
39/136 Test #39: test_flow_graph_whitebox ................. Passed 0.06 sec
Start 40: test_indexer_node
40/136 Test #40: test_indexer_node ........................ Passed 2.30 sec
Start 41: test_join_node
41/136 Test #41: test_join_node ........................... Passed 5.83 sec
Start 42: test_join_node_key_matching
42/136 Test #42: test_join_node_key_matching .............. Passed 2.44 sec
Start 43: test_join_node_msg_key_matching
43/136 Test #43: test_join_node_msg_key_matching .......... Passed 0.27 sec
Start 44: test_join_node_msg_key_matching_n_args
44/136 Test #44: test_join_node_msg_key_matching_n_args ... Passed 2.15 sec
Start 45: test_join_node_preview
45/136 Test #45: test_join_node_preview ................... Passed 0.04 sec
Start 46: test_limiter_node
46/136 Test #46: test_limiter_node ........................ Passed 8.76 sec
Start 47: test_priority_queue_node
47/136 Test #47: test_priority_queue_node ................. Passed 0.06 sec
Start 48: test_queue_node
^C 48/136 Test #48: test_queue_node ..........................***Exception: Interrupt 79.83 sec
Start 49: test_sequencer_node
49/136 Test #49: test_sequencer_node ...................... Passed 0.32 sec
Start 50: test_split_node
50/136 Test #50: test_split_node .......................... Passed 0.99 sec
Start 51: test_tagged_msg
51/136 Test #51: test_tagged_msg .......................... Passed 0.02 sec
Start 52: test_overwrite_node
52/136 Test #52: test_overwrite_node ......................Bus error***Exception: 3.50 sec
Start 53: test_write_once_node
53/136 Test #53: test_write_once_node .....................Bus error***Exception: 8.72 sec
Start 54: test_async_node
54/136 Test #54: test_async_node ..........................Bus error***Exception: 6.65 sec
Start 55: test_input_node
55/136 Test #55: test_input_node .......................... Passed 0.27 sec
Start 56: test_profiling
56/136 Test #56: test_profiling ........................... Passed 0.03 sec
Start 57: test_concurrent_queue_whitebox
57/136 Test #57: test_concurrent_queue_whitebox ........... Passed 0.02 sec
Start 58: test_intrusive_list
58/136 Test #58: test_intrusive_list ...................... Passed 3.64 sec
Start 59: test_semaphore
59/136 Test #59: test_semaphore ........................... Passed 3.22 sec
Start 60: test_environment_whitebox
60/136 Test #60: test_environment_whitebox ................ Passed 0.10 sec
Start 61: test_hw_concurrency
61/136 Test #61: test_hw_concurrency ...................... Passed 0.01 sec
Start 62: test_eh_thread
62/136 Test #62: test_eh_thread ........................... Passed 0.86 sec
Start 63: test_global_control
63/136 Test #63: test_global_control ...................... Passed 0.49 sec
Start 64: test_task
64/136 Test #64: test_task ................................Bus error***Exception: 1.04 sec
Start 65: test_concurrent_monitor
65/136 Test #65: test_concurrent_monitor .................. Passed 0.29 sec
Start 66: test_scheduler_mix
66/136 Test #66: test_scheduler_mix ....................... Passed 66.25 sec
Start 67: test_handle_perror
67/136 Test #67: test_handle_perror ....................... Passed 0.04 sec
Start 68: test_arena_constraints
68/136 Test #68: test_arena_constraints ...................***Failed Required regular expression not found. Regex=[oneTBB: TBBBIND.*tbbbind_2_5
] 0.13 sec
Start 69: test_tbb_fork
69/136 Test #69: test_tbb_fork ............................ Passed 14.01 sec
Start 70: test_tbb_header
70/136 Test #70: test_tbb_header .......................... Passed 0.02 sec
Start 71: test_openmp
71/136 Test #71: test_openmp .............................. Passed 0.33 sec
Start 72: conformance_tick_count
72/136 Test #72: conformance_tick_count ................... Passed 0.01 sec
Start 73: conformance_allocators
73/136 Test #73: conformance_allocators ................... Passed 27.00 sec
Start 74: conformance_mutex
74/136 Test #74: conformance_mutex ........................Bus error***Exception: 5.03 sec
Start 75: conformance_task_group
75/136 Test #75: conformance_task_group ................... Passed 0.03 sec
Start 76: conformance_task_group_context
76/136 Test #76: conformance_task_group_context ........... Passed 0.02 sec
Start 77: conformance_task_arena
77/136 Test #77: conformance_task_arena ................... Passed 0.19 sec
Start 78: conformance_collaborative_call_once
78/136 Test #78: conformance_collaborative_call_once ...... Passed 0.03 sec
Start 79: conformance_concurrent_lru_cache
79/136 Test #79: conformance_concurrent_lru_cache ......... Passed 0.01 sec
Start 80: conformance_concurrent_unordered_map
80/136 Test #80: conformance_concurrent_unordered_map .....***Exception: SegFault 8.91 sec
Start 81: conformance_concurrent_unordered_set
81/136 Test #81: conformance_concurrent_unordered_set .....***Exception: SegFault 0.95 sec
Start 82: conformance_concurrent_map
82/136 Test #82: conformance_concurrent_map ...............Bus error***Exception: 1.17 sec
Start 83: conformance_concurrent_set
83/136 Test #83: conformance_concurrent_set ...............Bus error***Exception: 1.03 sec
Start 84: conformance_concurrent_priority_queue
84/136 Test #84: conformance_concurrent_priority_queue .... Passed 0.98 sec
Start 85: conformance_parallel_for
85/136 Test #85: conformance_parallel_for ................. Passed 14.32 sec
Start 86: conformance_parallel_for_each
86/136 Test #86: conformance_parallel_for_each ............ Passed 0.62 sec
Start 87: conformance_parallel_reduce
87/136 Test #87: conformance_parallel_reduce .............. Passed 2.85 sec
Start 88: conformance_parallel_scan
88/136 Test #88: conformance_parallel_scan ................ Passed 0.02 sec
Start 89: conformance_parallel_sort
89/136 Test #89: conformance_parallel_sort ................ Passed 0.10 sec
Start 90: conformance_parallel_pipeline
90/136 Test #90: conformance_parallel_pipeline ............ Passed 8.40 sec
Start 91: conformance_parallel_invoke
91/136 Test #91: conformance_parallel_invoke .............. Passed 1.31 sec
Start 92: conformance_blocked_range
92/136 Test #92: conformance_blocked_range ................ Passed 0.73 sec
Start 93: conformance_blocked_range2d
93/136 Test #93: conformance_blocked_range2d .............. Passed 5.89 sec
Start 94: conformance_blocked_range3d
94/136 Test #94: conformance_blocked_range3d .............. Passed 2.03 sec
Start 95: conformance_blocked_rangeNd
95/136 Test #95: conformance_blocked_rangeNd .............. Passed 0.91 sec
Start 96: conformance_concurrent_vector
96/136 Test #96: conformance_concurrent_vector ............ Passed 17.82 sec
Start 97: conformance_global_control
97/136 Test #97: conformance_global_control ............... Passed 0.37 sec
Start 98: conformance_concurrent_hash_map
98/136 Test #98: conformance_concurrent_hash_map .......... Passed 13.00 sec
Start 99: conformance_enumerable_thread_specific
99/136 Test #99: conformance_enumerable_thread_specific ... Passed 19.04 sec
Start 100: conformance_combinable
100/136 Test #100: conformance_combinable ...................Bus error***Exception: 14.37 sec
Start 101: conformance_concurrent_queue
101/136 Test #101: conformance_concurrent_queue .............SIGTRAP***Exception: 12.17 sec
Start 102: conformance_resumable_tasks
102/136 Test #102: conformance_resumable_tasks .............. Passed 0.02 sec
Start 103: conformance_version
103/136 Test #103: conformance_version ......................***Failed 0.02 sec
Start 104: conformance_function_node
104/136 Test #104: conformance_function_node ................ Passed 0.08 sec
Start 105: conformance_multifunction_node
105/136 Test #105: conformance_multifunction_node ........... Passed 0.07 sec
Start 106: conformance_input_node
106/136 Test #106: conformance_input_node ................... Passed 2.67 sec
Start 107: conformance_continue_node
107/136 Test #107: conformance_continue_node ................ Passed 0.02 sec
Start 108: conformance_async_node
108/136 Test #108: conformance_async_node ................... Passed 0.07 sec
Start 109: conformance_overwrite_node
109/136 Test #109: conformance_overwrite_node ............... Passed 0.02 sec
Start 110: conformance_write_once_node
110/136 Test #110: conformance_write_once_node .............. Passed 0.02 sec
Start 111: conformance_buffer_node
111/136 Test #111: conformance_buffer_node .................. Passed 0.02 sec
Start 112: conformance_queue_node
112/136 Test #112: conformance_queue_node ................... Passed 0.02 sec
Start 113: conformance_priority_queue_node
113/136 Test #113: conformance_priority_queue_node .......... Passed 0.02 sec
Start 114: conformance_sequencer_node
114/136 Test #114: conformance_sequencer_node ............... Passed 0.03 sec
Start 115: conformance_limiter_node
115/136 Test #115: conformance_limiter_node ................. Passed 0.02 sec
Start 116: conformance_broadcast_node
116/136 Test #116: conformance_broadcast_node ............... Passed 0.02 sec
Start 117: conformance_composite_node
117/136 Test #117: conformance_composite_node ............... Passed 0.02 sec
Start 118: conformance_indexer_node
118/136 Test #118: conformance_indexer_node ................. Passed 0.02 sec
Start 119: conformance_split_node
119/136 Test #119: conformance_split_node ................... Passed 0.02 sec
Start 120: conformance_join_node
120/136 Test #120: conformance_join_node .................... Passed 0.02 sec
Start 121: conformance_graph
121/136 Test #121: conformance_graph ........................ Passed 0.03 sec
Start 122: conformance_arena_constraints
122/136 Test #122: conformance_arena_constraints ............***Failed Required regular expression not found. Regex=[oneTBB: TBBBIND.*tbbbind_2_5
] 0.08 sec
Start 123: test_scalable_allocator
123/136 Test #123: test_scalable_allocator .................. Passed 17.68 sec
Start 124: test_malloc_pools
124/136 Test #124: test_malloc_pools ........................ Passed 7.70 sec
Start 125: test_malloc_init_shutdown
125/136 Test #125: test_malloc_init_shutdown ................ Passed 0.02 sec
Start 126: test_malloc_regression
126/136 Test #126: test_malloc_regression ................... Passed 0.07 sec
Start 127: test_malloc_shutdown_hang
127/136 Test #127: test_malloc_shutdown_hang ................ Passed 1.62 sec
Start 128: test_malloc_compliance
128/136 Test #128: test_malloc_compliance ................... Passed 32.15 sec
Start 129: test_malloc_used_by_lib
129/136 Test #129: test_malloc_used_by_lib ..................SIGTRAP***Exception: 0.62 sec
Start 130: test_malloc_lib_unload
130/136 Test #130: test_malloc_lib_unload ...................SIGTRAP***Exception: 0.40 sec
Start 131: test_malloc_pure_c
131/136 Test #131: test_malloc_pure_c ....................... Passed 0.32 sec
Start 132: test_malloc_whitebox
132/136 Test #132: test_malloc_whitebox .....................Bus error***Exception: 10.10 sec
Start 133: test_malloc_atexit
133/136 Test #133: test_malloc_atexit .......................SIGTRAP***Exception: 0.41 sec
Start 134: test_malloc_overload
134/136 Test #134: test_malloc_overload .....................Bus error***Exception: 0.47 sec
Start 135: test_malloc_overload_disable
135/136 Test #135: test_malloc_overload_disable ............. Passed 0.02 sec
Start 136: test_malloc_new_handler
136/136 Test #136: test_malloc_new_handler .................. Passed 0.04 sec
78% tests passed, 30 tests failed out of 136
Total Test time (real) = 1673.07 sec
The following tests FAILED:
7 - test_concurrent_unordered_map (Bus error)
8 - test_concurrent_unordered_set (Bus error)
9 - test_concurrent_map (SEGFAULT)
10 - test_concurrent_set (Bus error)
23 - test_task_group (SEGFAULT)
25 - test_task_arena (Bus error)
28 - test_resumable_tasks (Bus error)
30 - test_function_node (Bus error)
31 - test_multifunction_node (INTERRUPT)
35 - test_continue_node (INTERRUPT)
48 - test_queue_node (INTERRUPT)
52 - test_overwrite_node (Bus error)
53 - test_write_once_node (Bus error)
54 - test_async_node (Bus error)
64 - test_task (Bus error)
68 - test_arena_constraints (Failed)
74 - conformance_mutex (Bus error)
80 - conformance_concurrent_unordered_map (SEGFAULT)
81 - conformance_concurrent_unordered_set (SEGFAULT)
82 - conformance_concurrent_map (Bus error)
83 - conformance_concurrent_set (Bus error)
100 - conformance_combinable (Bus error)
101 - conformance_concurrent_queue (SIGTRAP)
103 - conformance_version (Failed)
122 - conformance_arena_constraints (Failed)
129 - test_malloc_used_by_lib (SIGTRAP)
130 - test_malloc_lib_unload (SIGTRAP)
132 - test_malloc_whitebox (Bus error)
133 - test_malloc_atexit (SIGTRAP)
134 - test_malloc_overload (Bus error)
Errors while running CTest
@reble @anton-potapov @alexey-katranov @vossmjp @pavelkumbrasev We really need this fixed now, since Stan
probabilistic platform depends on oneTBB
. (By “we” I mean not just myself but MacPorts users.)
Could someone please help me with reviewing this?
P. S. Again, please excuse me for taking so long to finalize this, but in result fixes are substantially improved and unnecessary changes are dropped.
Also, if someone can advise on fixing remaining tests, would be awesome. Quite likely it is wrong alignments, like Bool is somewhere unconditionally assumed to be 1 byte etc. (it is 4 bytes on Darwin ppc32).
Linking against libatomic is also needed on multiple 32-bit targets on Linux.
Currently we're explicitly linking against libatomic on mips32, m68k, powerpc32 and sh4 [1]:
# explicitly link against libatomic
ifneq (,$(filter $(DEB_HOST_ARCH),armel mipsel m68k sh4 powerpc))
export DEB_LDFLAGS_MAINT_APPEND = -Wl,--no-as-needed -latomic -Wl,--as-needed
endif
So, I think 268ada4ca917f06c9b8d7090426850ab34a425e1 should employ a test to determine whether the target platform needs to link against libatomic. For example, the mold linker uses the following trick [2]:
# If atomics doesn't work by default, add -latomic.
# We need the flag on riscv, armv6 and m68k.
include(CheckCXXSourceCompiles)
check_cxx_source_compiles("#include <atomic>
int main() {
std::atomic<uint8_t> w1;
std::atomic<uint16_t> w2;
std::atomic<uint32_t> w4;
std::atomic<uint64_t> w8;
return ++w1 + ++w2 + ++w4 + ++w8;
}" HAVE_FULL_ATOMIC_SUPPORT)
if(NOT HAVE_FULL_ATOMIC_SUPPORT)
target_link_libraries(mold PRIVATE atomic)
endif()
[1] https://salsa.debian.org/science-team/tbb/-/blob/master/debian/rules#L20 [2] https://github.com/rui314/mold/blob/main/CMakeLists.txt#L227
Linking against libatomic is also needed on multiple 32-bit targets on Linux.
Currently we're explicitly linking against libatomic on mips32, m68k, powerpc32 and sh4 [1]:
# explicitly link against libatomic ifneq (,$(filter $(DEB_HOST_ARCH),armel mipsel m68k sh4 powerpc)) export DEB_LDFLAGS_MAINT_APPEND = -Wl,--no-as-needed -latomic -Wl,--as-needed endif
So, I think 268ada4ca917f06c9b8d7090426850ab34a425e1 should employ a test to determine whether the target platform needs to link against libatomic. For example, the mold linker uses the following trick [2]:
# If atomics doesn't work by default, add -latomic. # We need the flag on riscv, armv6 and m68k. include(CheckCXXSourceCompiles) check_cxx_source_compiles("#include <atomic> int main() { std::atomic<uint8_t> w1; std::atomic<uint16_t> w2; std::atomic<uint32_t> w4; std::atomic<uint64_t> w8; return ++w1 + ++w2 + ++w4 + ++w8; }" HAVE_FULL_ATOMIC_SUPPORT) if(NOT HAVE_FULL_ATOMIC_SUPPORT) target_link_libraries(mold PRIVATE atomic) endif()
[1] https://salsa.debian.org/science-team/tbb/-/blob/master/debian/rules#L20
[2] https://github.com/rui314/mold/blob/main/CMakeLists.txt#L227
@glaubitz I have no idea how that works on Linux, to be honest, but on macOS, at least macOS PPC, explicit linking to libatomic
is a standard – and required – practice. We have numerous cases of this kind in Macports. I think this is GCC-specific, in fact, not even arch-specific.
@barracuda156 Can you test whether the following works for you:
diff --git a/cmake/compilers/GNU.cmake b/cmake/compilers/GNU.cmake
index cd76acfe..8282b54e 100644
--- a/cmake/compilers/GNU.cmake
+++ b/cmake/compilers/GNU.cmake
@@ -44,6 +44,20 @@ if (NOT MINGW)
set(TBB_COMMON_LINK_LIBS dl)
endif()
+include(CheckCXXSourceCompiles)
+check_cxx_source_compiles("#include <atomic>
+int main() {
+ std::atomic<uint8_t> w1;
+ std::atomic<uint16_t> w2;
+ std::atomic<uint32_t> w4;
+ std::atomic<uint64_t> w8;
+ return ++w1 + ++w2 + ++w4 + ++w8;
+}" HAVE_FULL_ATOMIC_SUPPORT)
+
+if(NOT HAVE_FULL_ATOMIC_SUPPORT)
+ set(TBB_COMMON_LINK_LIBS ${TBB_COMMON_LINK_LIBS} atomic)
+endif()
+
# Ignore -Werror set through add_compile_options() or added to CMAKE_CXX_FLAGS if TBB_STRICT is disabled.
if (NOT TBB_STRICT AND COMMAND tbb_remove_compile_flag)
tbb_remove_compile_flag(-Werror)
I have successfully tested it on 32-bit PowerPC on Linux.
@barracuda156 Can you test whether the following works for you:
diff --git a/cmake/compilers/GNU.cmake b/cmake/compilers/GNU.cmake index cd76acfe..8282b54e 100644 --- a/cmake/compilers/GNU.cmake +++ b/cmake/compilers/GNU.cmake @@ -44,6 +44,20 @@ if (NOT MINGW) set(TBB_COMMON_LINK_LIBS dl) endif() +include(CheckCXXSourceCompiles) +check_cxx_source_compiles("#include <atomic> +int main() { + std::atomic<uint8_t> w1; + std::atomic<uint16_t> w2; + std::atomic<uint32_t> w4; + std::atomic<uint64_t> w8; + return ++w1 + ++w2 + ++w4 + ++w8; +}" HAVE_FULL_ATOMIC_SUPPORT) + +if(NOT HAVE_FULL_ATOMIC_SUPPORT) + set(TBB_COMMON_LINK_LIBS ${TBB_COMMON_LINK_LIBS} atomic) +endif() + # Ignore -Werror set through add_compile_options() or added to CMAKE_CXX_FLAGS if TBB_STRICT is disabled. if (NOT TBB_STRICT AND COMMAND tbb_remove_compile_flag) tbb_remove_compile_flag(-Werror)
I have successfully tested it on 32-bit PowerPC on Linux.
@glaubitz Thank you! Allow me a day, I will test it for ppc32 and update you. (Away from PPC hardware atm, so cannot test ppc64: Rosetta in 10.6 only supports ppc32.)
@barracuda156 It shouldn't be required on 64-bit machines, only on 32-bit. 64-bit machines can do all the atomic operations in hardware without the need of a helper library.
FWIW, it's not a MacOS-related issue, but a GCC issue. There is even a GCC bug for it, see [1].
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358
@barracuda156 It shouldn't be required on 64-bit machines, only on 32-bit. 64-bit machines can do all the atomic operations in hardware without the need of a helper library.
FWIW, it's not a MacOS-related issue, but a GCC issue. There is even a GCC bug for it, see [1].
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358
@glaubitz I vaguely recall @iains mentioned elsewhere why GCC behaves that way, but unfortunately forgot related details. He can also say decisively on ppc64 on macOS, if he gets time to respond here. (I can try that out, but once I am back to my PowerMacs.)
@barracuda156 Well, if you just run a small compile test during cmake stage, you can actually dynamically verify whether a given target needs libatomic
or not. I think that's the best solution as it avoids having to add other targets in the future.
@barracuda156 Well, if you just run a small compile test during cmake stage, you can actually dynamically verify whether a given target needs
libatomic
or not. I think that's the best solution as it avoids having to add other targets in the future.
@glaubitz Yes, I agree. I verify and update you soon.
P. S. What do you think about static assert case above btw? If any ideas/suggestions.
@barracuda156 It shouldn't be required on 64-bit machines, only on 32-bit. 64-bit machines can do all the atomic operations in hardware without the need of a helper library.
ppc64/G5 can indeed do up to 64b lock-free. It needs the library for > 8bytes (again this is an ISA constraint, not a software one).
FWIW, it's not a MacOS-related issue, but a GCC issue. There is even a GCC bug for it, see [1].
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358
It seems that there has been some discussion ...
@glaubitz I vaguely recall @iains mentioned elsewhere why GCC behaves that way, but unfortunately forgot related details.
Well. I could have imagined that there would be reluctance to add libraries unilaterally since that puts a burden on code that does not need them (I don't recall reasoning beyond the at the moment) - but let's see what pans out in the PR.
Well. I could have imagined that there would be reluctance to add libraries unilaterally since that puts a burden on code that does not need them (I don't recall reasoning beyond the at the moment) - but let's see what pans out in the PR.
I agree. But if we actually use a compile test as suggested above this situation will never occur.
@barracuda156 > P. S. What do you think about static assert case above btw? If any ideas/suggestions.
Haven't seen this issue on 32-bit PowerPC on Linux. With the libatomic fix, the code builds fine and we have just one failure (#982).
@barracuda156 > P. S. What do you think about static assert case above btw? If any ideas/suggestions.
Haven't seen this issue on 32-bit PowerPC on Linux. With the libatomic fix, the code builds fine and we have just one failure (#982).
@glaubitz I have seen size- and alignment-related static assert failures on macOS PPC elsewhere too, these are likely ABI-related and not arch-related as such. (For example, 32-bit Bool is specific to Darwin ppc32.)
@barracuda156 Can you test whether the following works for you:
diff --git a/cmake/compilers/GNU.cmake b/cmake/compilers/GNU.cmake index cd76acfe..8282b54e 100644 --- a/cmake/compilers/GNU.cmake +++ b/cmake/compilers/GNU.cmake @@ -44,6 +44,20 @@ if (NOT MINGW) set(TBB_COMMON_LINK_LIBS dl) endif() +include(CheckCXXSourceCompiles) +check_cxx_source_compiles("#include <atomic> +int main() { + std::atomic<uint8_t> w1; + std::atomic<uint16_t> w2; + std::atomic<uint32_t> w4; + std::atomic<uint64_t> w8; + return ++w1 + ++w2 + ++w4 + ++w8; +}" HAVE_FULL_ATOMIC_SUPPORT) + +if(NOT HAVE_FULL_ATOMIC_SUPPORT) + set(TBB_COMMON_LINK_LIBS ${TBB_COMMON_LINK_LIBS} atomic) +endif() + # Ignore -Werror set through add_compile_options() or added to CMAKE_CXX_FLAGS if TBB_STRICT is disabled. if (NOT TBB_STRICT AND COMMAND tbb_remove_compile_flag) tbb_remove_compile_flag(-Werror)
I have successfully tested it on 32-bit PowerPC on Linux.
@glaubitz Yes, your suggestion works for macOS PPC. The test fails:
-- Performing Test HAVE_FULL_ATOMIC_SUPPORT
-- Performing Test HAVE_FULL_ATOMIC_SUPPORT - Failed
And then -latomic
is added:
/opt/local/bin/g++-mp-11 -pipe -Os -Wno-unused-function -Wno-parentheses -DNDEBUG -I/opt/local/include -D_GLIBCXX_USE_CXX11_ABI=0 -arch ppc -mmacosx-version-min=10.6 -dynamiclib -Wl,-headerpad_max_install_names -Wl,-exported_symbols_list,/opt/local/var/macports/build/_opt_PPCRosettaPorts_devel_onetbb/onetbb/work/oneTBB-2021.7.0/src/tbb/def/mac32-tbb.def -L/opt/local/lib -Wl,-headerpad_max_install_names -compatibility_version 12.0.0 -current_version 12.7.0 -o ../../gnu_11.3_cxx11_32_macports/libtbb.12.7.dylib -install_name /opt/local/lib/libtbb.12.dylib CMakeFiles/tbb.dir/address_waiter.cpp.o CMakeFiles/tbb.dir/allocator.cpp.o CMakeFiles/tbb.dir/arena.cpp.o CMakeFiles/tbb.dir/arena_slot.cpp.o CMakeFiles/tbb.dir/concurrent_bounded_queue.cpp.o CMakeFiles/tbb.dir/dynamic_link.cpp.o CMakeFiles/tbb.dir/exception.cpp.o CMakeFiles/tbb.dir/governor.cpp.o CMakeFiles/tbb.dir/global_control.cpp.o CMakeFiles/tbb.dir/itt_notify.cpp.o CMakeFiles/tbb.dir/main.cpp.o CMakeFiles/tbb.dir/market.cpp.o CMakeFiles/tbb.dir/misc.cpp.o CMakeFiles/tbb.dir/misc_ex.cpp.o CMakeFiles/tbb.dir/observer_proxy.cpp.o CMakeFiles/tbb.dir/parallel_pipeline.cpp.o CMakeFiles/tbb.dir/private_server.cpp.o CMakeFiles/tbb.dir/profiling.cpp.o CMakeFiles/tbb.dir/rml_tbb.cpp.o CMakeFiles/tbb.dir/rtm_mutex.cpp.o CMakeFiles/tbb.dir/rtm_rw_mutex.cpp.o CMakeFiles/tbb.dir/semaphore.cpp.o CMakeFiles/tbb.dir/small_object_pool.cpp.o CMakeFiles/tbb.dir/task.cpp.o CMakeFiles/tbb.dir/task_dispatcher.cpp.o CMakeFiles/tbb.dir/task_group_context.cpp.o CMakeFiles/tbb.dir/version.cpp.o CMakeFiles/tbb.dir/queuing_rw_mutex.cpp.o -Wl,-rpath,/opt/local/lib -ldl -latomic
On Dec 11, 2022, at 1:30 PM, Sergey Fedorov @.***> wrote: @barracuda156 commented on this pull request.
In test/conformance/conformance_allocators.cpp:
@@ -31,13 +31,15 @@ TEST_CASE("Allocator concept") { TestAllocator<oneapi::tbb::cache_aligned_allocator
>(Concept); TestAllocator<oneapi::tbb::tbb_allocator >(Concept);
- #ifndef POWERPC // This test fails to build on Darwin PPC.
@glaubitz Yes, you are right, it is better to consider that case. I have rebased now, adding APPLE in the condition here and also moving to your code in GNU.cmake.You can just drop that commit for the atomics check as I have already a PR with that open in a cleaned-up variant.See #987
You can just drop that commit for the atomics check as I have already a PR with that open in a cleaned-up variant.See #987
@glaubitz Okay, awesome. Dropping it now.
Rebased to fix copyright years: apparently now 2005–2023 is required.
@reble @vossmjp @dnmokhov Could someone take a look here, please? Let us start the year with fixing PPC :)
@barracuda156 you don't need to correct copyright year in files those you didn't change. Please revert copyright year in such files.
@barracuda156 you don't need to correct copyright year in files those you didn't change. Please revert copyright year in such files.
@pavelkumbrasev Fixed, thank you!
@reble @pavelkumbrasev @glaubitz Rebased to current master.
UPD. Let me drop cosmetics commit and improve a bit here and there. I will update soon, hopefully.
@anton-potapov @glaubitz @pavelkumbrasev @reble Could someone review this? I hope we can merge this soon and not keep carrying over and rebasing forever :)
This commit has broken oneTBB
on older Darwin: https://github.com/oneapi-src/oneTBB/commit/0f0d1cb994733cde5f93e34b1fe845f334e4240c
Could we fix that? New code should not be used on 10.6 and below and likely should not be used with GCC, since blocks are not supported in it, and libdispatch needs blocks.
@anton-potapov @reble @pavelkumbrasev @glaubitz Rebased again to the current master.
However now broken semaphore.h
is to be fixed.
@barracuda156 could you please take a look at https://github.com/oneapi-src/oneTBB/blob/master/SYSTEM_REQUIREMENTS.md#limitations With regard to new section in system requirements we can create a separate branch in this repo that will accommodate add ppc32 support. What do you think?