Tobias Ribizel

105 issues by Tobias Ribizel

I'd like to investigate implementing a reduction for associative but non-commutative operations. Related to https://github.com/NVIDIA/thrust/issues/1434. This kind of algorithm comes in handy when establishing the global context in parsing...
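To illustrate what such a reduction has to guarantee, here is a minimal sequential sketch. It is not Thrust's implementation or API: the name `ordered_tree_reduce` and its signature are hypothetical. It combines neighbouring elements pairwise, which is the shape a parallel reduction would use, but it never swaps the left and right operand, so it only relies on associativity.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: a pairwise (tree-shaped) reduction that never swaps
// operands, so it is valid for any associative operation, commutative or not.
template <typename T, typename Op>
T ordered_tree_reduce(std::vector<T> values, T identity, Op op)
{
    if (values.empty()) {
        return identity;
    }
    // Each pass combines neighbours (2i, 2i + 1) in left-to-right order.
    while (values.size() > 1) {
        std::vector<T> next;
        next.reserve((values.size() + 1) / 2);
        for (std::size_t i = 0; i + 1 < values.size(); i += 2) {
            next.push_back(op(values[i], values[i + 1]));
        }
        // An unpaired last element is carried over unchanged, keeping its
        // position at the right end of the sequence.
        if (values.size() % 2 == 1) {
            next.push_back(values.back());
        }
        values = std::move(next);
    }
    return values.front();
}
```

String concatenation is a simple associative, non-commutative test case: reducing `{"a", "b", "c", "d", "e"}` with `+` gives `"abcde"`, whereas a reduction that reorders operands could return the pieces in a different order.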

CMake has (or has gained) many features for which we still maintain custom workarounds. I think we should remove those workarounds soon:
* CUDA device architectures can be auto-detected since CMake...

reg:build
reg:testing
reg:ci-cd
reg:documentation
mod:core
mod:cuda
reg:example
reg:benchmarking
type:preconditioner
type:matrix-format
1:ST:WIP
mod:hip
1:ST:need-feedback

A lot of the files we have in `common/cuda_hip` don't work standalone, but instead require certain other headers to be included before them. Here I'm trying to change this,...

reg:build
reg:testing
mod:core
mod:cuda
reg:benchmarking
type:solver
type:preconditioner
type:matrix-format
1:ST:ready-for-review
mod:hip
type:factorization
type:reordering
reg:helper-scripts
type:multigrid
type:stopping-criteria

Not sure how specific we should be here; this is probably overkill.

mod:cuda
type:matrix-format
mod:hip
reg:helper-scripts

There are different sizes inside the `index_set` that have unclear names. I want to propose `get_subset_size()` or `get_local_size()` and `get_superset_size()` or `get_global_size()`, as well as `get_num_ranges()` instead of `get_num_subsets()`. See...
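To make the three quantities concrete, here is a small sketch. The class `index_set_sketch` is a hypothetical stand-in, not Ginkgo's actual `index_set`. It picks one of the proposed name pairs (`get_local_size()` / `get_global_size()`), and the meaning attached to each accessor is my reading of the proposal.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in, not Ginkgo's actual index_set class.
class index_set_sketch {
public:
    // global_size: size of the superset [0, global_size)
    // ranges: disjoint half-open ranges [begin, end) inside the superset
    index_set_sketch(std::size_t global_size,
                     std::vector<std::pair<std::size_t, std::size_t>> ranges)
        : global_size_{global_size}, ranges_{std::move(ranges)}
    {}

    // Size of the superset the indices are drawn from.
    std::size_t get_global_size() const { return global_size_; }

    // Total number of indices contained in the set.
    std::size_t get_local_size() const
    {
        std::size_t count = 0;
        for (const auto& range : ranges_) {
            count += range.second - range.first;
        }
        return count;
    }

    // Number of contiguous ranges used to store the set.
    std::size_t get_num_ranges() const { return ranges_.size(); }

private:
    std::size_t global_size_;
    std::vector<std::pair<std::size_t, std::size_t>> ranges_;
};
```

For a set holding the ranges `[0, 3)` and `[5, 7)` out of ten possible indices, the three accessors return three different numbers (10, 5 and 2), which is why distinct, descriptive names help.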

is:idea

I think the current factory parameter setup could do with a few improvements:
- [x] The factory parameters generated by our macros are `mutable` by default! That's a big code...

is:idea

Still need to check performance on this, but we can really use the atomic operations based on OpenMP primitives.

mod:cuda
mod:openmp
type:solver
mod:hip
type:factorization

Allow `DpcppExecutor` to be constructed from a `sycl::device`, which enables the sub-device usage required for #1373. I also added a handful of fixes for deprecation warnings.

mod:core
type:solver
type:preconditioner
type:matrix-format
type:factorization
mod:dpcpp

As a tool for implementing reusable factories, this adds reusable functionality for all Csr permutation and transpose functions. It also takes a first step towards making `Permutation` the default representation...

reg:build
reg:testing
mod:core
mod:cuda
mod:reference
type:matrix-format
1:ST:ready-for-review
mod:hip

Currently ReferenceExecutor derives from OmpExecutor to inherit the allocation and copy functionality. That means that some OpenMP functionality needs to be compiled even with OpenMP disabled. Maybe we want to...

is:todo