Marcel Koester

Results 6 issues of Marcel Koester

This PR adds a new meta optimizer, designed for CPUs, to the Algorithms library. It implements a version of the SGO algorithm that is tuned for both optimization quality and runtime performance: _Squid Game...

feature

This PR adds native hardware acceleration for 128-bit vectors to `Velocity`. This backend is referred to as `Velocity128`. It features SSE and ARM64 Neon hardware acceleration for X64 and Arm64...

feature

This PR adds a variety of fixed-number types to ILGPU.Algorithms. In contrast to our fixed-precision types, these types are based on base-10 decimal places instead of powers of two. This...

feature
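
The contrast drawn in the summary above, base-10 decimal places versus powers of two, can be illustrated with a small sketch. This is not ILGPU code: the function names and the choices of two decimal places and eight fractional bits are assumptions made only for illustration.

```python
from fractions import Fraction


def encode_decimal(value: str, decimal_places: int) -> int:
    """Store a value as an integer count of 10**-decimal_places units (hypothetical helper)."""
    return round(Fraction(value) * 10**decimal_places)


def decode_decimal(raw: int, decimal_places: int) -> Fraction:
    """Recover the exact value represented by a decimal-scaled integer."""
    return Fraction(raw, 10**decimal_places)


def encode_binary(value: str, fractional_bits: int) -> int:
    """Store a value as an integer count of 2**-fractional_bits units (hypothetical helper)."""
    return round(Fraction(value) * 2**fractional_bits)


def decode_binary(raw: int, fractional_bits: int) -> Fraction:
    """Recover the exact value represented by a binary-scaled integer."""
    return Fraction(raw, 2**fractional_bits)


# 0.1 is exactly representable with a base-10 scale of two decimal places ...
decimal_raw = encode_decimal("0.1", 2)
# ... but has to be rounded to the nearest multiple of 1/256 with a base-2 scale.
binary_raw = encode_binary("0.1", 8)
```

In this sketch, decoding `decimal_raw` gives back exactly one tenth, while decoding `binary_raw` gives 13/128, the nearest value an eight-bit binary fraction can hold.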

This PR updates the `ControlFlowVerifier` within our verification pipeline to use internal traversal methods instead of its own recursive traversal algorithm. This avoids running into stack overflows in cases the...

bug
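
The general idea behind replacing a recursive traversal to avoid stack overflows can be sketched as follows. This is a generic illustration, not the `ControlFlowVerifier` implementation or ILGPU's internal traversal methods: the `reachable_blocks` function and the dictionary-based graph are hypothetical.

```python
def reachable_blocks(successors, entry):
    """Visit every block reachable from entry using an explicit stack.

    successors maps a block to a list of its successor blocks (hypothetical
    representation). Because the pending work lives in a heap-allocated list
    rather than on the call stack, deep graphs cannot overflow the call stack.
    """
    visited = set()
    order = []
    stack = [entry]
    while stack:
        block = stack.pop()
        if block in visited:
            continue
        visited.add(block)
        order.append(block)
        # Push successors in reverse so the first successor is visited first.
        stack.extend(reversed(successors.get(block, ())))
    return order


# A chain far deeper than Python's default recursion limit.
deep_chain = {i: [i + 1] for i in range(100_000)}
visited_in_chain = reachable_blocks(deep_chain, 0)

# A small diamond-shaped control-flow graph.
diamond = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
diamond_order = reachable_blocks(diamond, "a")
```

A naive recursive depth-first search over `deep_chain` would exceed Python's default recursion limit, whereas the explicit-stack version visits all blocks.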

### Describe the bug
Our latest stable release does not support the CUDA 12 SDK, which causes issues with Blas in certain cases.
### Environment
- ILGPU version: 1.5.1
- .NET...

bug

This PR adds support for 256-bit registers and AVX instruction acceleration to our Velocity backend.

feature