Hüseyin Tuğrul BÜYÜKIŞIK issues

Results: 26 issues of Hüseyin Tuğrul BÜYÜKIŞIK

Is it possible to convert ImdiskHandleComm() into multi-threaded version?

I'm adding multi-GPU support to it so that all GPUs can serve a single drive. It works, but it is slow because of multiple lock-guards and the small paging size in the...

Does this have an LRU caching layer (in RAM) between VRAM and disk access?

For example, I have an OpenCL-based VRAM virtual-array class that improves performance for random accesses, even when the accesses are not in very large chunks: https://github.com/tugrul512bit/VirtualMultiArray/wiki/Cache-Hit-Ratio-Benchmark 64-thread access: ![LRU-64thread](https://raw.githubusercontent.com/tugrul512bit/VirtualMultiArray/main/benchmark_data/mN1rk8.png) Single-thread...
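To make the question concrete, here is a minimal sketch of an LRU page cache in RAM sitting in front of a slower backing store. It is not the VirtualMultiArray implementation: the class name `LruPageCache`, its parameters, and the use of a plain Python list as a stand-in for VRAM are all assumptions made for illustration, and the sketch is read-only (no write-back of dirty pages).

```python
from collections import OrderedDict


class LruPageCache:
    """Hypothetical read-only LRU cache of fixed-size pages over a slow store."""

    def __init__(self, backing, page_size, max_pages):
        self.backing = backing        # any sliceable sequence standing in for VRAM
        self.page_size = page_size
        self.max_pages = max_pages
        self.pages = OrderedDict()    # page index -> page data, least recent first
        self.hits = 0
        self.misses = 0

    def _page(self, page_index):
        if page_index in self.pages:
            self.hits += 1
            self.pages.move_to_end(page_index)    # mark as most recently used
            return self.pages[page_index]
        self.misses += 1
        if len(self.pages) >= self.max_pages:
            self.pages.popitem(last=False)        # evict the least recently used page
        start = page_index * self.page_size
        page = list(self.backing[start:start + self.page_size])
        self.pages[page_index] = page
        return page

    def get(self, index):
        """Read one element, fetching its whole page on a miss."""
        return self._page(index // self.page_size)[index % self.page_size]

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


backing = list(range(1000))
cache = LruPageCache(backing, page_size=10, max_pages=4)
values = [cache.get(i) for i in (0, 1, 2, 500, 3, 999)]
```

Fetching a whole page on a miss is what lets small random reads benefit: neighbouring elements are already in RAM on the next access.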

I have 2x K420 and 1x GT1030, but the app sees 3x GT1030

CUDA version: https://i.snipboard.io/kUT6v8.jpg I tried all 3 options from the dropdown menu and all of them used the GT1030. Is there a way to specify devices explicitly before running or just by using...

single device pipeline: kernel repeat option

Sometimes a kernel needs to be repeated, such as in a "fluid solver", with the same global+local range values.

enhancement

single device pipeline: overlapping regions percentage in total latency

For example, a 3-stage pipeline result: pipeline 1: 3ms, 25% overlapped; pipeline 2: 1ms, totally hidden; pipeline 3: 20ms, 8% overlapped; total overlapping regions: 15%; time saved: 2ms (will...

enhancement
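One way the overlap percentages requested above could be derived is from per-stage start and end timestamps. The sketch below is only an illustration under that assumption: the function names, the `(start_ms, end_ms)` input format, and the sample timestamps are hypothetical and do not come from the library or from the figures quoted in the issue. Each stage is assumed to have a non-zero duration.

```python
def merge(intervals):
    """Merge overlapping (start, end) intervals into a sorted, disjoint list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered(interval, others):
    """Length of `interval` that is covered by the union of `others`."""
    start, end = interval
    return sum(max(0.0, min(end, e) - max(start, s)) for s, e in merge(others))


def overlap_report(stages):
    """stages: dict of stage name -> (start_ms, end_ms).

    Returns (percent of each stage hidden behind other stages, time saved in ms),
    where time saved is the serial sum of durations minus the wall-clock span
    actually occupied by at least one stage.
    """
    percent = {}
    for name, interval in stages.items():
        others = [iv for other, iv in stages.items() if other != name]
        percent[name] = 100.0 * covered(interval, others) / (interval[1] - interval[0])
    serial = sum(end - start for start, end in stages.values())
    wall = sum(end - start for start, end in merge(stages.values()))
    return percent, serial - wall


# Hypothetical timestamps in milliseconds.
percent, saved_ms = overlap_report({"read": (0, 4), "compute": (3, 5), "write": (5, 10)})
```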

ClArray.name to bind an array to a kernel parameter with exact spelling

This way, it will be possible to bind only the necessary arrays to a kernel instead of all arrays.

enhancement
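The name-based binding idea above can be sketched as follows: read the parameter names from the kernel signature and pick only the arrays whose names match exactly. This is a rough illustration, not the library's API: `kernel_parameter_names`, `bind_by_name`, and the dictionary of arrays are hypothetical, and the regular expression assumes a simple `__kernel void name(...)` signature without nested parentheses.

```python
import re


def kernel_parameter_names(kernel_source, kernel_name):
    """Return the parameter names of `kernel_name` in declaration order."""
    pattern = r"__kernel\s+void\s+" + re.escape(kernel_name) + r"\s*\(([^)]*)\)"
    match = re.search(pattern, kernel_source)
    if match is None:
        raise ValueError(f"kernel {kernel_name!r} not found")
    names = []
    for parameter in match.group(1).split(","):
        parameter = parameter.strip()
        if parameter:
            # The name is the last identifier, e.g. "__global float * a" -> "a".
            names.append(re.findall(r"[A-Za-z_]\w*", parameter)[-1])
    return names


def bind_by_name(kernel_source, kernel_name, arrays):
    """arrays: dict of name -> array. Returns the arrays in the kernel's order."""
    bound = []
    for name in kernel_parameter_names(kernel_source, kernel_name):
        if name not in arrays:
            raise KeyError(f"no array named {name!r} for kernel {kernel_name!r}")
        bound.append(arrays[name])
    return bound


source = "__kernel void add(__global float * a, __global float * b, __global float * c) { }"
arrays = {"c": [3.0], "a": [1.0], "unused": [9.0], "b": [2.0]}
bound = bind_by_name(source, "add", arrays)
```

Arrays that no parameter asks for (here `unused`) are simply left out, which is the point of the request.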

Device to device pipeline: enable mixed ordering of kernel arrays (in kernel function definition)

Then developers can use any order they want instead of only: __kernel void test(input1,input2,hidden1,hidden2,hidden3,output1,output2){} Instead of handling inputs+hiddens+outputs differently in the parameter building part, add all into a single array...

enhancement

Device to device pipeline: balancing load (kernel names) between neighboring stages

Moving kernel names from one stage to another to alter the total latencies of stages, in order to minimize the total latency of the pipeline / increase throughput. Example: - checks all stages' timings....

enhancement
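A minimal sketch of the stage balancing idea above, assuming each kernel has a known measured time and that kernels may only move across a stage boundary to a direct neighbor, so their overall order is preserved. The greedy strategy, the function names, and the timings are assumptions for illustration, not the behavior of the library. It targets the slowest stage, since that stage bounds pipeline throughput.

```python
def stage_time(stage, kernel_ms):
    """Total time of one stage: the sum of its kernels' measured times."""
    return sum(kernel_ms[kernel] for kernel in stage)


def balance(stages, kernel_ms, max_moves=100):
    """Greedily shift boundary kernels out of the slowest stage into a neighbor.

    A move is accepted only if it lowers the time of the slowest stage.
    """
    stages = [list(stage) for stage in stages]    # work on a copy
    for _ in range(max_moves):
        times = [stage_time(stage, kernel_ms) for stage in stages]
        worst = max(range(len(stages)), key=times.__getitem__)
        best = None                               # (new bottleneck, candidate layout)
        if len(stages[worst]) > 1:                # never leave a stage empty
            for neighbor in (worst - 1, worst + 1):
                if not 0 <= neighbor < len(stages):
                    continue
                trial = [list(stage) for stage in stages]
                if neighbor < worst:
                    # First kernel of the slow stage becomes the last of the previous one.
                    trial[neighbor].append(trial[worst].pop(0))
                else:
                    # Last kernel of the slow stage becomes the first of the next one.
                    trial[neighbor].insert(0, trial[worst].pop())
                bottleneck = max(stage_time(stage, kernel_ms) for stage in trial)
                if bottleneck < times[worst] and (best is None or bottleneck < best[0]):
                    best = (bottleneck, trial)
        if best is None:
            break                                 # no neighbor move helps any more
        stages = best[1]
    return stages


# Hypothetical kernel timings in milliseconds.
kernel_ms = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 1}
original = [["a"], ["b", "c", "d"], ["e"]]
balanced = balance(original, kernel_ms)
```

Being greedy, this can stop at a local optimum; it is meant to show the shape of the heuristic rather than an optimal partitioning.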

Add built-in image-resizing method for PNG, GIF and JPEG

Uses the compressor-decompressor methods.

Epic

feature

Add built-in JPEG, GIF, PNG decompression-recompression methods

So that implementing an image-resizer will be faster.

Epic

feature