Carsten Bauer

Results: 99 comments by Carsten Bauer

First test kernel:

```julia
function kernel_wmma_bf16_lowlevel(a_dev, b_dev, c_dev, d_dev)
    a_frag = WMMA.llvm_wmma_load_a_col_m16n16k16_global_stride_bf16(pointer(a_dev), 16)
    b_frag = WMMA.llvm_wmma_load_b_col_m16n16k16_global_stride_bf16(pointer(b_dev), 16)
    c_frag = WMMA.llvm_wmma_load_c_col_m16n16k16_global_stride_f32(pointer(c_dev), 16)
    d_frag = WMMA.llvm_wmma_mma_col_col_m16n16k16_bf16(a_frag, b_frag, c_frag)
    WMMA.llvm_wmma_store_d_col_m16n16k16_global_stride_f32(pointer(d_dev), d_frag, 16)
    return...
```

I can't add the "cuda kernels" and "enhancement" labels (being a member of JuliaGPU is apparently not enough).

Comment: I'm not sure what the fragment type (`map_ptx_to_jl_frag`) and size (`map_frag_sizes`) should be...

After the latest commit (fixing the fragment sizes) I (still) get

```julia
julia> call_kernel()
ERROR: CUDA error: an illegal memory access was encountered (code 700, ERROR_ILLEGAL_ADDRESS)
Stacktrace:
 [1] throw_api_error(res::CUDA.cudaError_enum)
   @ CUDA...
```

First test kernel:

```julia
function kernel_wmma_f64_lowlevel(a_dev, b_dev, c_dev, d_dev)
    a_frag = WMMA.llvm_wmma_load_a_col_m8n8k4_global_stride_f64(pointer(a_dev), 8)
    b_frag = WMMA.llvm_wmma_load_b_col_m8n8k4_global_stride_f64(pointer(b_dev), 4)
    c_frag = WMMA.llvm_wmma_load_c_col_m8n8k4_global_stride_f64(pointer(c_dev), 8)
    d_frag = WMMA.llvm_wmma_mma_col_col_m8n8k4_f64(a_frag, b_frag, c_frag)
    WMMA.llvm_wmma_store_d_col_m8n8k4_global_stride_f64(pointer(d_dev), d_frag, 8)
    return...
```

After the latest commit (fixing the fragment sizes) I get

```julia
julia> call_kernel()
ERROR: CUDA error: an illegal memory access was encountered (code 700, ERROR_ILLEGAL_ADDRESS)
Stacktrace:
 [1] throw_api_error(res::CUDA.cudaError_enum)
   @ CUDA /scratch/pc2-mitarbeiter/bauerc/devel/CUDA_f64/lib/cudadrv/error.jl:91...
```

I'm currently attempting something like this in a package (we could add this feature to CUDA.jl afterwards). However, I wonder how you would implement this to reliably achieve peak performance?...

+1 for this. It would also be nice for switching to the current nightly or to a Julia started with `-J mycustomsysimg.dll`. I reckon this should be rather trivial to implement, no?

I would really love to see this. Quite regularly I find myself wanting to use a different Julia binary in Juno than the default one....

Yes. And perhaps a custom name field to give the binaries nice names like "Julia 1.3.1 MKL".