
Simd sort

Open Johan511 opened this issue 1 year ago • 8 comments

Enabling vectorization for sorting using 3rd party library (https://github.com/intel/x86-simd-sort)

(Work in Progress)

Johan511 avatar Aug 19 '23 13:08 Johan511

Performance test report

HPX Performance

Comparison

| Benchmark | FORK_JOIN_EXECUTOR | PARALLEL_EXECUTOR | SCHEDULER_EXECUTOR |
|---|---|---|---|
| For Each | (=) | ?? | - |

Info

| Property | Before | After |
|---|---|---|
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-19T13:19:41+00:00 |
| HPX Commit | dcb541576898d370113946ba15fb58c20c8325b2 | cd65a806000094b5f31cd3a6e9752009fb33a735 |
| Clustername | rostam | rostam |
| Envfile | | |
| Datetime | 2023-05-10T14:50:18.616050-05:00 | 2023-08-19T08:25:36.834748-05:00 |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |

Comparison

| Benchmark | NO-EXECUTOR |
|---|---|
| Future Overhead - Create Thread Hierarchical - Latch | = |

Info

| Property | Before | After |
|---|---|---|
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-19T13:19:41+00:00 |
| HPX Commit | dcb541576898d370113946ba15fb58c20c8325b2 | cd65a806000094b5f31cd3a6e9752009fb33a735 |
| Clustername | rostam | rostam |
| Envfile | | |
| Datetime | 2023-05-10T14:52:35.047119-05:00 | 2023-08-19T08:27:50.442351-05:00 |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |

Comparison

| Benchmark | FORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATOR | PARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATOR | SCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR |
|---|---|---|---|
| Stream Benchmark - Add | (=) | (=) | (=) |
| Stream Benchmark - Scale | (=) | (=) | = |
| Stream Benchmark - Triad | (=) | (=) | (=) |
| Stream Benchmark - Copy | (=) | (=) | (=) |

Info

| Property | Before | After |
|---|---|---|
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-19T13:19:41+00:00 |
| HPX Commit | dcb541576898d370113946ba15fb58c20c8325b2 | cd65a806000094b5f31cd3a6e9752009fb33a735 |
| Clustername | rostam | rostam |
| Envfile | | |
| Datetime | 2023-05-10T14:52:52.237641-05:00 | 2023-08-19T08:28:07.336369-05:00 |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |

Explanation of Symbols

| Symbol | Meaning |
|---|---|
| = | No performance change (confidence interval within ±1%) |
| (=) | Probably no performance change (confidence interval within ±2%) |
| (+) / (-) | Very small performance improvement/degradation (≤1%) |
| + / - | Small performance improvement/degradation (≤5%) |
| ++ / -- | Large performance improvement/degradation (≤10%) |
| +++ / --- | Very large performance improvement/degradation (>10%) |
| ? | Probably no change, but quite large uncertainty (confidence interval within ±5%) |
| ?? | Unclear result, very large uncertainty (±10%) |
| ??? | Something unexpected… |

StellarBot avatar Aug 19 '23 13:08 StellarBot

Performance test report

HPX Performance

Comparison

| Benchmark | FORK_JOIN_EXECUTOR | PARALLEL_EXECUTOR | SCHEDULER_EXECUTOR |
|---|---|---|---|
| For Each | (=) | ?? | - |

Info

| Property | Before | After |
|---|---|---|
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-19T18:38:27+00:00 |
| HPX Commit | dcb541576898d370113946ba15fb58c20c8325b2 | 24dfcd5c69aa95b68b4be8b2c3de23cea6aa6d99 |
| Envfile | | |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |
| Clustername | rostam | rostam |
| Datetime | 2023-05-10T14:50:18.616050-05:00 | 2023-08-19T13:45:18.850938-05:00 |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |

Comparison

| Benchmark | NO-EXECUTOR |
|---|---|
| Future Overhead - Create Thread Hierarchical - Latch | (=) |

Info

| Property | Before | After |
|---|---|---|
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-19T18:38:27+00:00 |
| HPX Commit | dcb541576898d370113946ba15fb58c20c8325b2 | 24dfcd5c69aa95b68b4be8b2c3de23cea6aa6d99 |
| Envfile | | |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |
| Clustername | rostam | rostam |
| Datetime | 2023-05-10T14:52:35.047119-05:00 | 2023-08-19T13:47:31.102556-05:00 |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |

Comparison

| Benchmark | FORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATOR | PARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATOR | SCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR |
|---|---|---|---|
| Stream Benchmark - Add | (=) | (=) | (=) |
| Stream Benchmark - Scale | (=) | (=) | (=) |
| Stream Benchmark - Triad | (=) | (=) | (=) |
| Stream Benchmark - Copy | (=) | (=) | (=) |

Info

| Property | Before | After |
|---|---|---|
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-19T18:38:27+00:00 |
| HPX Commit | dcb541576898d370113946ba15fb58c20c8325b2 | 24dfcd5c69aa95b68b4be8b2c3de23cea6aa6d99 |
| Envfile | | |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |
| Clustername | rostam | rostam |
| Datetime | 2023-05-10T14:52:52.237641-05:00 | 2023-08-19T13:47:48.033631-05:00 |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |

StellarBot avatar Aug 19 '23 18:08 StellarBot

Performance test report

HPX Performance

Comparison

| Benchmark | FORK_JOIN_EXECUTOR | PARALLEL_EXECUTOR | SCHEDULER_EXECUTOR |
|---|---|---|---|
| For Each | (=) | ?? | (=) |

Info

| Property | Before | After |
|---|---|---|
| HPX Commit | dcb541576898d370113946ba15fb58c20c8325b2 | b60c175dcead0a8576859035baf24bb81b4821b4 |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-29T06:23:47+00:00 |
| Clustername | rostam | rostam |
| Datetime | 2023-05-10T14:50:18.616050-05:00 | 2023-08-29T01:31:05.004560-05:00 |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |
| Envfile | | |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |

Comparison

| Benchmark | NO-EXECUTOR |
|---|---|
| Future Overhead - Create Thread Hierarchical - Latch | (=) |

Info

| Property | Before | After |
|---|---|---|
| HPX Commit | dcb541576898d370113946ba15fb58c20c8325b2 | b60c175dcead0a8576859035baf24bb81b4821b4 |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-29T06:23:47+00:00 |
| Clustername | rostam | rostam |
| Datetime | 2023-05-10T14:52:35.047119-05:00 | 2023-08-29T01:33:17.791388-05:00 |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |
| Envfile | | |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |

Comparison

| Benchmark | FORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATOR | PARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATOR | SCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR |
|---|---|---|---|
| Stream Benchmark - Add | (=) | (=) | (=) |
| Stream Benchmark - Scale | = | (=) | = |
| Stream Benchmark - Triad | (=) | = | (=) |
| Stream Benchmark - Copy | (=) | - | (=) |

Info

| Property | Before | After |
|---|---|---|
| HPX Commit | dcb541576898d370113946ba15fb58c20c8325b2 | b60c175dcead0a8576859035baf24bb81b4821b4 |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-29T06:23:47+00:00 |
| Clustername | rostam | rostam |
| Datetime | 2023-05-10T14:52:52.237641-05:00 | 2023-08-29T01:33:34.692225-05:00 |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |
| Envfile | | |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |

StellarBot avatar Aug 29 '23 06:08 StellarBot

image

Speedup observed for simd-sort

Johan511 avatar Aug 29 '23 21:08 Johan511

@hkaiser There are multiple issues with integrating this feature into HPX:

  1. We need to add feature tests for AVX512.
  2. Some vector intrinsics are not supported on certain architectures and cause compile errors. On medusa, a vector load instruction for 16-bit integers was not supported, which makes compilation of the target application fail even if i16 is not being sorted.
  3. Similar to point 2, some architectures might not support certain data types (e.g. _Float16), which can cause compile-time failures in HPX.
  4. If HPX is compiled with simd sort, the target application must be compiled with -march=native (or all the flags required to support vectorization). Otherwise, compilation of the target application will fail.

The library has different files for sorting 16-bit, 32-bit, and 64-bit numbers. I was considering adding a feature test for each of these files and including them behind #ifdef guards. This would still be an issue if the user compiles HPX and the HPX application (target) for different architectures. It might also lead to some false positives (features which are supported might not be made available to the user).
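A rough sketch of what such a guard could look like for the 32-bit case, under several assumptions. `EXAMPLE_HAVE_SIMD_SORT_32BIT` is a hypothetical macro that a build-time feature test would define; it is not an existing HPX option. The header name `avx512-32bit-qsort.hpp` and the call `avx512_qsort<T>(T*, int64_t)` are assumed to match the x86-simd-sort sources and should be checked against the library version in use. `__AVX512F__` is the macro GCC and Clang predefine when AVX-512F code generation is enabled (for example via -march=native on a supporting CPU), so a translation unit built without those flags would take the `std::sort` fallback instead of failing to compile.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical: a build-time feature test would define
// EXAMPLE_HAVE_SIMD_SORT_32BIT. It is deliberately left undefined here.
#if defined(EXAMPLE_HAVE_SIMD_SORT_32BIT) && defined(__AVX512F__)
#include "avx512-32bit-qsort.hpp"    // assumed x86-simd-sort header name
#define EXAMPLE_USE_SIMD_SORT_32BIT 1
#else
#define EXAMPLE_USE_SIMD_SORT_32BIT 0
#endif

namespace example {

// Sorts 32-bit integers, using the vectorized path only when both the
// feature test and the current compile flags allow it.
inline void sort_i32(std::vector<std::int32_t>& v)
{
#if EXAMPLE_USE_SIMD_SORT_32BIT
    // Assumed library entry point; verify against the vendored version.
    avx512_qsort<std::int32_t>(v.data(), static_cast<std::int64_t>(v.size()));
#else
    std::sort(v.begin(), v.end());    // portable fallback
#endif
    assert(std::is_sorted(v.begin(), v.end()));
}

}    // namespace example
```

Because the guard is evaluated per translation unit, HPX and the application could end up on different code paths if they are built with different flags; this sketch only avoids the hard compile failure, not that mismatch.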

I would like your opinion on whether I should proceed with the above design.

Johan511 avatar Aug 29 '23 21:08 Johan511

Performance test report

HPX Performance

Comparison

| Benchmark | FORK_JOIN_EXECUTOR | PARALLEL_EXECUTOR | SCHEDULER_EXECUTOR |
|---|---|---|---|
| For Each | (=) | ?? | (=) |

Info

| Property | Before | After |
|---|---|---|
| HPX Commit | dcb541576898d370113946ba15fb58c20c8325b2 | 0c93a3b853046c0d10b1f0ba3b7225c85800499e |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-29T21:45:09+00:00 |
| Clustername | rostam | rostam |
| Datetime | 2023-05-10T14:50:18.616050-05:00 | 2023-08-29T16:57:31.036372-05:00 |
| Envfile | | |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |

Comparison

| Benchmark | NO-EXECUTOR |
|---|---|
| Future Overhead - Create Thread Hierarchical - Latch | (=) |

Info

| Property | Before | After |
|---|---|---|
| HPX Commit | dcb541576898d370113946ba15fb58c20c8325b2 | 0c93a3b853046c0d10b1f0ba3b7225c85800499e |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-29T21:45:09+00:00 |
| Clustername | rostam | rostam |
| Datetime | 2023-05-10T14:52:35.047119-05:00 | 2023-08-29T16:59:43.227039-05:00 |
| Envfile | | |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |

Comparison

| Benchmark | FORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATOR | PARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATOR | SCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR |
|---|---|---|---|
| Stream Benchmark - Add | (=) | (=) | (=) |
| Stream Benchmark - Scale | (=) | (=) | = |
| Stream Benchmark - Triad | (=) | (=) | (=) |
| Stream Benchmark - Copy | (=) | - | (=) |

Info

| Property | Before | After |
|---|---|---|
| HPX Commit | dcb541576898d370113946ba15fb58c20c8325b2 | 0c93a3b853046c0d10b1f0ba3b7225c85800499e |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-29T21:45:09+00:00 |
| Clustername | rostam | rostam |
| Datetime | 2023-05-10T14:52:52.237641-05:00 | 2023-08-29T17:00:00.111855-05:00 |
| Envfile | | |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |

StellarBot avatar Aug 29 '23 22:08 StellarBot

Performance test report

HPX Performance

Comparison

| Benchmark | FORK_JOIN_EXECUTOR | PARALLEL_EXECUTOR | SCHEDULER_EXECUTOR |
|---|---|---|---|
| For Each | (=) | ?? | (=) |

Info

| Property | Before | After |
|---|---|---|
| HPX Commit | dcb541576898d370113946ba15fb58c20c8325b2 | 25b4e45923f079fe7367d5855b20ba1ac6d45f5e |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-29T21:58:26+00:00 |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |
| Clustername | rostam | rostam |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |
| Envfile | | |
| Datetime | 2023-05-10T14:50:18.616050-05:00 | 2023-08-29T17:07:47.004681-05:00 |

Comparison

| Benchmark | NO-EXECUTOR |
|---|---|
| Future Overhead - Create Thread Hierarchical - Latch | (=) |

Info

| Property | Before | After |
|---|---|---|
| HPX Commit | dcb541576898d370113946ba15fb58c20c8325b2 | 25b4e45923f079fe7367d5855b20ba1ac6d45f5e |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-29T21:58:26+00:00 |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |
| Clustername | rostam | rostam |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |
| Envfile | | |
| Datetime | 2023-05-10T14:52:35.047119-05:00 | 2023-08-29T17:09:59.091186-05:00 |

Comparison

| Benchmark | FORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATOR | PARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATOR | SCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR |
|---|---|---|---|
| Stream Benchmark - Add | (=) | (=) | (=) |
| Stream Benchmark - Scale | (=) | (=) | (=) |
| Stream Benchmark - Triad | (=) | (=) | (=) |
| Stream Benchmark - Copy | (=) | - | (=) |

Info

| Property | Before | After |
|---|---|---|
| HPX Commit | dcb541576898d370113946ba15fb58c20c8325b2 | 25b4e45923f079fe7367d5855b20ba1ac6d45f5e |
| HPX Datetime | 2023-05-10T12:07:53+00:00 | 2023-08-29T21:58:26+00:00 |
| Compiler | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 | /opt/apps/llvm/13.0.1/bin/clang++ 13.0.1 |
| Clustername | rostam | rostam |
| Hostname | medusa08.rostam.cct.lsu.edu | medusa08.rostam.cct.lsu.edu |
| Envfile | | |
| Datetime | 2023-05-10T14:52:52.237641-05:00 | 2023-08-29T17:10:15.975599-05:00 |

StellarBot avatar Aug 29 '23 22:08 StellarBot

Coverage summary from Codacy

See diff coverage on Codacy

| Coverage variation | Diff coverage |
|---|---|
| -69.13% | ∅ |

Coverage variation details

| | Coverable lines | Covered lines | Coverage |
|---|---|---|---|
| Common ancestor commit (103a7b8e3719a0db948d1abde29de0ff91e070be) | 190583 | 162311 | 85.17% |
| Head commit (1c18b4a29c8806e696b4139e117f2b5448a2285b) | 188842 (-1741) | 30284 (-132027) | 16.04% (-69.13%) |

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>
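As a quick sanity check of the figures in this report, the small sketch below recomputes both coverage values and their difference from the line counts. The function names are made up for illustration and are not part of Codacy or HPX.

```cpp
#include <cassert>

// Coverage in percent: covered lines divided by coverable lines.
inline double coverage_percent(long covered, long coverable)
{
    assert(coverable > 0);
    return 100.0 * static_cast<double>(covered) /
        static_cast<double>(coverable);
}

// Coverage variation: coverage of the head commit minus coverage of the
// common ancestor commit, in percentage points.
inline double coverage_variation(double head_percent, double ancestor_percent)
{
    return head_percent - ancestor_percent;
}
```

With the reported counts, `coverage_percent(162311, 190583)` is about 85.17, `coverage_percent(30284, 188842)` is about 16.04, and their difference is about -69.13 percentage points, matching the summary.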

Diff coverage details
| | Coverable lines | Covered lines | Diff coverage |
|---|---|---|---|
| Pull request (#6326) | 0 | 0 | ∅ (not applicable) |

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%
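The "not applicable" result presumably follows from this formula: with 0 coverable lines added or modified, the ratio is undefined. A minimal sketch of that edge case, using a hypothetical helper name and `std::optional` (C++17) to represent the ∅ value:

```cpp
#include <cassert>
#include <optional>

// Diff coverage in percent, or no value when the pull request added or
// modified no coverable lines (the ratio would be 0/0).
inline std::optional<double> diff_coverage_percent(
    long covered_changed, long coverable_changed)
{
    assert(covered_changed >= 0 && coverable_changed >= covered_changed);
    if (coverable_changed == 0)
        return std::nullopt;    // "not applicable"
    return 100.0 * static_cast<double>(covered_changed) /
        static_cast<double>(coverable_changed);
}
```

For this pull request the inputs are 0 and 0, so the helper returns no value rather than a percentage.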


codacy-production[bot] avatar Sep 14 '23 15:09 codacy-production[bot]