FelixCLC issues

Results: 19 issues of FelixCLC

Possibility of adding support for Linux for Apple AMX1 and AMX2

Hi all, I'm in the process of researching Apple AMX as a potential way of speeding up IEEE FP BLAS kernels in OpenBLAS. On the macOS side, it seems that...

Improve I/O performance by implementing the use of io_uring

It may be possible to increase performance by using the io_uring asynchronous API within the program for reading input. An example of a basic setup is here: https://twitter.com/axboe/status/1576671920488972288/photo/1 The man page is here: https://man.archlinux.org/man/io_uring.7

PDF/Printer friendly version of detailed posts for Accessibility

Hi Travis, I came across your post on AVX-512 (https://travisdowns.github.io/blog/2020/01/17/avxfreq1.html) and, after skimming through it, your methodology and documentation seemed strong. Looking at other posts, this seems to hold true...

expanded `noinline` cannot compile anything past script 21, including script 22 in navi10

### Environment

| Hardware | description |
|----------|---------------|
| GPU | -gfx1010-rx5600xt |
| CPU | -12700k-AVX512 pcores only |

| Software | version |
|----------|---------|
| OS | -...

Unable to build OneVPL-GPU: issue of scope for VAProcFilterCap3DLUT

## System information
model name : 12th Gen Intel(R) Core(TM) i7-12700K
00:02.0 Display controller [0380]: Intel Corporation AlderLake-S GT1 [8086:4680] (rev 0c)
03:00.0 VGA compatible controller [0300]: Advanced Micro Devices,...

bug

Build

CUDA-accelerated tonemap filter

The initial idea of using POCL as a CUDA translation layer isn't viable because POCL does not work with image formats on CUDA. Currently reaching out to Yasroslav Pogrebnyak, the developer...

WIP

Parsing of host arguments to node-appropriate, HW-accelerated variants

One of the requirements to pull this whole thing together will be the ability to parse arguments requested from the host and change them to hardware-accelerated versions. For example, if...

documentation

enhancement

WIP

Distribution infrastructure: UnicornTranscoder vs kube-plex

Have to find out whether I'm better off using UT or KP. Kubernetes is borderline standard for distributed compute these days, and would make scaling out to more nodes much easier...

documentation

question

Haven't started coding

Interest in SDR<->HDR tonemapping inputs

Any interest in commands for SDR<->HDR tonemapping? There are versions for Vulkan, oneAPI, CUDA, OpenCL, and CPU.

enhancement

command