Hüseyin Tuğrul BÜYÜKIŞIK
Disabling any "stream" or "zero copy" should make it 8 times faster for your system. 2x RX480 are on best PCI-e slots and others are raised from 4x or 1x...
OK, the two-CPU issue must be the AMD APP version plus Intel's own implementation. So I was right about the 1x riser, but I didn't expect all of them to be 1x. Also these...
Nice open-air case for overclocking. PCI-e 2.0 in 1x mode should reach 300-400 MB/s in reality, and only for big arrays (at least 8-10 MB). So it's normal. But when...
v1.3.3 will have a fully functional "task pool" feature: you feed it independent workloads and it schedules them onto idle GPUs to keep them busy, even if they...
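The "feed idle devices" idea behind a task pool can be sketched as a small simulation. This is only an illustration under stated assumptions, not the library's scheduler or API: `Schedule`, `scheduleToIdleDevices`, and the per-task cost model are hypothetical names invented here, and real GPUs are replaced by simple "idle at time t" counters.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Result of a simulated run (hypothetical type, for illustration only).
struct Schedule {
    std::vector<std::size_t> deviceOfTask; // deviceOfTask[i] = device that ran task i
    double makespan;                       // time at which the last device finishes
};

// Greedy scheduling: each task, in order, goes to whichever device becomes
// idle first. taskCost[i] is the assumed run time of task i.
Schedule scheduleToIdleDevices(const std::vector<double>& taskCost,
                               std::size_t deviceCount) {
    Schedule result{std::vector<std::size_t>(taskCost.size()), 0.0};
    if (deviceCount == 0) return result; // nothing to schedule on

    // Min-heap of (time the device becomes idle, device index).
    using Slot = std::pair<double, std::size_t>;
    std::priority_queue<Slot, std::vector<Slot>, std::greater<Slot>> idleAt;
    for (std::size_t d = 0; d < deviceCount; ++d) idleAt.push({0.0, d});

    for (std::size_t i = 0; i < taskCost.size(); ++i) {
        Slot next = idleAt.top();          // device that is idle soonest
        idleAt.pop();
        result.deviceOfTask[i] = next.second;
        next.first += taskCost[i];         // device is busy until this time
        if (next.first > result.makespan) result.makespan = next.first;
        idleAt.push(next);
    }
    return result;
}
```

With costs `{4, 1, 1, 1, 1}` and two devices, the long task occupies one device while the four short ones run back to back on the other, so neither device sits idle while work remains.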
v1.4.1_update4 now properly targets OpenCL 1.2. This should work for some failing functions such as atom_xchg() in kernel codes. I didn't try on Nvidia as I don't have any (yet).
__kernel void test(__global float * parameter1, __constant int * parameter2){}
__kernel void test2(__global float * parameter1){}
ClArray a=new float[1024];
a.name="parameter1";
ClArray b=new float[1024];
b.name="parameter2";
a.nextParam(b).compute(...," test test2 ",...);
then test2...
Can't JavaScript use a C++-compiled DLL to compute the critical parts of the collision detection? In C++, a single CPU thread can compute all-pair collisions of 30k AABBs at 60 FPS using a 3D octree-like structure....
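For reference, the building block of any such collision pass is the AABB overlap test. The sketch below is a brute-force O(n^2) baseline, not the octree-like structure mentioned above; the `AABB` layout and the function names are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Axis-aligned bounding box given by its minimum and maximum corners.
struct AABB {
    float minX, minY, minZ;
    float maxX, maxY, maxZ;
};

// Two boxes overlap only if their intervals overlap on all three axes.
// Touching faces count as overlapping here (<= and >=).
bool overlaps(const AABB& a, const AABB& b) {
    return a.minX <= b.maxX && a.maxX >= b.minX &&
           a.minY <= b.maxY && a.maxY >= b.minY &&
           a.minZ <= b.maxZ && a.maxZ >= b.minZ;
}

// Brute-force all-pair count: every unordered pair is tested exactly once.
std::size_t countOverlappingPairs(const std::vector<AABB>& boxes) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < boxes.size(); ++i)
        for (std::size_t j = i + 1; j < boxes.size(); ++j)
            if (overlaps(boxes[i], boxes[j])) ++count;
    return count;
}
```

A spatial structure such as an octree or grid keeps the same `overlaps` test but only applies it to boxes that share a cell, which is what makes tens of thousands of boxes per frame feasible.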
And Mandelbrot set generation.
Computing particles in random order would add latency, not just because of the random number generation but also because of the non-cached access to the particles. The cache line is wasted and the CPU has to do...
Some of the implementations have method names starting with **threadSafe**. They should be tested, but I am currently unable to do so because I am busy.
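The two access patterns being compared can be written side by side. This is a minimal sketch with hypothetical names (`Particle`, `sumMassSequential`, `sumMassIndexed`, `shuffledOrder`); it only shows the patterns and makes no timing claim, since the actual gap depends on array size and hardware.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

struct Particle { float x, y, z, mass; };

// Memory-order traversal: neighbouring particles share cache lines.
double sumMassSequential(const std::vector<Particle>& p) {
    double total = 0.0;
    for (const Particle& q : p) total += q.mass;
    return total;
}

// Traversal through an index list: with a shuffled list, each access may
// land on a different cache line, so most of each fetched line goes unused.
double sumMassIndexed(const std::vector<Particle>& p,
                      const std::vector<std::size_t>& order) {
    double total = 0.0;
    for (std::size_t i : order) total += p[i].mass;
    return total;
}

// Builds a random permutation of 0..n-1 from a fixed seed.
std::vector<std::size_t> shuffledOrder(std::size_t n, unsigned seed) {
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::mt19937 rng(seed);
    std::shuffle(order.begin(), order.end(), rng);
    return order;
}
```

Both functions do the same arithmetic on the same data, so timing them against each other on a large array isolates the cost of the access pattern from the cost of generating the random numbers.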