Mikael Simberg issues

Results: 232 issues of Mikael Simberg

Add CI configurations with stdexec

Nowadays `stdexec` should be feature-complete for DLA-Future's use cases. We should test that it works equivalently and add CI configurations for:
- [x] GCC (#930)
- [x] clang (#1024)...

Priority:Medium

Category:std::execution

Attempt to decrease binary size of `with_temporary_tile` test

The `with_temporary_tile` test is currently by far the largest test, at 175 MB (with CUDA): https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/4700071344751697/7514005670787789/-/jobs/5289553001#L3156. While not critical, it may be worth looking into whether it's possible to decrease...

Priority:Low

Type:Refactoring

Type:Cleanup

Reduction to band miniapp prints negative flops when matrix size is equal to band size

For example:
```
[0] [0] 0.00350707s -204.11GFlop/s d (1024, 1024) (1024, 1024) 1024 (1, 1) 8 GPU
[1] [1] 0.000240678s -2974.21GFlop/s d (1024, 1024) (1024, 1024) 1024 (1, 1) 8...
```

Type:Bug

Priority:Low

Investigate throttling number of active algorithms ("unrolling factor")

Use some type of semaphore to limit the number of algorithms that can be scheduled concurrently instead of "unrolling" the full pipeline in one go. This may improve memory locality...
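A minimal sketch of the throttling idea, written in Python for illustration only: it is not DLA-Future or pika code, and `run_throttled`, `max_active`, and `workers` are hypothetical names. A counting semaphore caps how many tasks are active at once, and the helper also reports the peak concurrency it observed.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_throttled(tasks, max_active, workers=8):
    """Run callables concurrently with at most max_active active at once.

    Returns (results, peak), where peak is the highest number of tasks
    observed running at the same time.
    """
    slots = threading.Semaphore(max_active)  # one permit per allowed active task
    lock = threading.Lock()                  # protects the two counters below
    active = 0
    peak = 0

    def guarded(task):
        nonlocal active, peak
        with slots:  # blocks until a permit is free
            with lock:
                active += 1
                peak = max(peak, active)
            try:
                return task()
            finally:
                with lock:
                    active -= 1

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of the input tasks in its results
        results = list(pool.map(guarded, tasks))
    return results, peak
```

This sketch blocks worker threads while they wait for a permit. An implementation for an asynchronous task pipeline would more likely defer scheduling instead of blocking, so treat it only as an illustration of the limit itself.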

Priority:Medium

Type:Optimization

Optimize read-after-read access in `Matrix` subpipelines

See https://github.com/eth-cscs/DLA-Future/pull/898#discussion_r1238751326. In the worst case this may need support in `async_rw_mutex` in pika. Needs further investigation. Some investigation on where and if this actually could lead to a performance...

Priority:Medium

Type:Optimization

Don't retile matrix in reduction to band if not necessary

https://github.com/eth-cscs/DLA-Future/pull/908#discussion_r1234124259. Depends on #905.

Priority:Low

Type:Refactoring

Make choice of managed or unmanaged communicators explicit

E.g. as in https://github.com/eth-cscs/DLA-Future/pull/714#issuecomment-1310485151. Related to #712 and #714.

Type:New Feature

Priority:Low

Type:Refactoring

Add CI configuration completely without assertions

Cf. https://github.com/eth-cscs/DLA-Future/pull/834.

Type:New Feature

Category:CI

Priority:Low

Add CI configuration with newer clang

E.g. 15 or 16.

Type:New Feature

TODO:Task

Category:CI

Priority:Low

Allow using GPU build on non-GPU nodes

Umpire expects to find a GPU when it has been compiled with GPU support. If a GPU-enabled build of DLA-Future is used on a node without GPUs, DLA-Future will fail...

Type:New Feature

Priority:Low