
updating docs to include matrix-vector multiply example

akashkgarg opened this pull request 3 years ago · 5 comments

As I work through how to speed up some of the functionality in the SciML codebases using multiple GPUs, I thought I'd add my small experiments as examples for other users of this package. Comments and feedback are welcome, especially if the example(s) shown could be done better.
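For readers following along, a minimal sketch of what a multi-GPU matrix-vector multiply can look like in CUDA.jl (the `multi_gpu_matvec` helper and the row-wise partitioning are illustrative choices, not the exact code from this PR; it assumes a machine with one or more CUDA devices):

```julia
using CUDA

# Hypothetical helper: split the matrix row-wise across the available
# devices, run the per-block gemv on each GPU from its own task, and
# reassemble the result on the host.
function multi_gpu_matvec(A::Matrix{Float32}, x::Vector{Float32})
    devs = collect(devices())
    m = size(A, 1)
    bounds = round.(Int, range(0, m; length = length(devs) + 1))
    results = Vector{Vector{Float32}}(undef, length(devs))
    @sync for (i, dev) in enumerate(devs)
        Threads.@spawn begin
            device!(dev)                              # bind this task to its GPU
            dA = CuArray(A[bounds[i]+1:bounds[i+1], :])  # upload this device's block
            dx = CuArray(x)                           # each device gets its own copy of x
            results[i] = Array(dA * dx)               # cuBLAS gemv, then download
        end
    end
    reduce(vcat, results)
end

A = rand(Float32, 4096, 4096); x = rand(Float32, 4096)
@assert multi_gpu_matvec(A, x) ≈ A * x
```

Using one Julia task per device is the pattern CUDA.jl's multi-GPU support is built around: `device!` is task-local, so each task can drive its own GPU concurrently.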

akashkgarg avatar May 20 '21 17:05 akashkgarg

Codecov Report

Merging #918 (03192b6) into master (eb7c326) will decrease coverage by 0.00%. The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #918      +/-   ##
==========================================
- Coverage   77.00%   76.99%   -0.01%     
==========================================
  Files         121      121              
  Lines        7706     7708       +2     
==========================================
+ Hits         5934     5935       +1     
- Misses       1772     1773       +1     
Impacted Files Coverage Δ
lib/cusolver/CUSOLVER.jl 82.00% <0.00%> (-1.34%) :arrow_down:

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Powered by Codecov. Last update eb7c326...03192b6.

codecov[bot] avatar May 20 '21 18:05 codecov[bot]

Nice example! Any idea why the minimum times show a much more pronounced speed-up? It could be that CUDA.@sync does a synchronize(), which only synchronizes the current task. Maybe it should call device_synchronize() instead, but that's fairly costly.
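To make the distinction concrete, a small sketch of the two synchronization strategies when timing GPU work (assuming a CUDA device is available; the broadcast expression is just a stand-in workload):

```julia
using CUDA, BenchmarkTools

x = CUDA.rand(Float32, 2^20)

# CUDA.@sync runs the expression and then calls synchronize(), which only
# waits on streams belonging to the *current* task — work launched from
# other tasks (e.g. on other devices) may still be in flight:
@benchmark CUDA.@sync $x .* 2f0

# device_synchronize() waits for *all* outstanding work on the current
# device, regardless of which task launched it — a stronger but costlier
# barrier, which is why it can change the measured timings:
@benchmark begin
    $x .* 2f0
    device_synchronize()
end
```

In a multi-GPU benchmark one would loop over `devices()`, calling `device!(dev)` and `device_synchronize()` for each, to get a global barrier before stopping the timer.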

maleadt avatar May 21 '21 07:05 maleadt

@maleadt great question. I added device_synchronize() and it does reduce the variance quite a bit. The updated version is probably a more reasonable implementation/benchmark.

akashkgarg avatar May 24 '21 16:05 akashkgarg

@maleadt I added another example that does a reduction over a large array. Surprisingly, the multi-GPU case is significantly slower (although its maximum time is about 1/3 of the single-GPU case). Perhaps there is a better way to partition the data/computation than I'm doing here?
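One way such a partitioned reduction can be sketched (again illustrative, not the PR's exact code: each device sums its own chunk on-GPU and the handful of partial sums are combined on the host):

```julia
using CUDA

# Hypothetical helper: split the array into contiguous chunks, one per
# device; each GPU reduces its chunk with an on-device sum (mapreduce),
# and the per-device partials are added on the CPU.
function multi_gpu_sum(x::Vector{Float32})
    devs = collect(devices())
    bounds = round.(Int, range(0, length(x); length = length(devs) + 1))
    partials = Vector{Float32}(undef, length(devs))
    @sync for (i, dev) in enumerate(devs)
        Threads.@spawn begin
            device!(dev)
            chunk = CuArray(@view x[bounds[i]+1:bounds[i+1]])  # host → device copy
            partials[i] = sum(chunk)                           # on-device reduction
        end
    end
    sum(partials)  # tiny host-side combine
end

x = rand(Float32, 2^24)
@assert isapprox(multi_gpu_sum(x), sum(x); rtol = 1f-3)  # float order differs
```

A plausible explanation for the slowdown: a sum over `N` elements does only O(N) arithmetic, so if the data starts on the host, the PCIe upload dominates and splitting it across devices mostly just adds per-device launch and transfer overhead; the multi-GPU win would be clearer if the chunks already lived on their devices.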

akashkgarg avatar May 25 '21 19:05 akashkgarg

Nice examples!

amontoison avatar Oct 11 '21 22:10 amontoison