
Support of shared memory

Open jinz2014 opened this issue 1 year ago • 7 comments

Can you please explain how to use shared memory in a kernel? Does the Warp compiler optimize a kernel that uses shared memory? Thanks.

jinz2014 avatar Jul 16 '24 23:07 jinz2014

For example, a dot product of two arrays.

I suppose the warp.dot() function computes the dot product of two vectors, where each vector is an element of an array.
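
Here is roughly what I mean, a minimal sketch with illustrative names and sizes; each thread computes the dot product of one pair of vec3 elements:

    import warp as wp

    wp.init()

    @wp.kernel
    def dot_kernel(a: wp.array(dtype=wp.vec3),
                   b: wp.array(dtype=wp.vec3),
                   out: wp.array(dtype=float)):
        i = wp.tid()                  # one thread per array element
        out[i] = wp.dot(a[i], b[i])   # dot product of two vec3 elements

    n = 1024
    a = wp.zeros(n, dtype=wp.vec3)
    b = wp.zeros(n, dtype=wp.vec3)
    out = wp.zeros(n, dtype=float)
    wp.launch(dot_kernel, dim=n, inputs=[a, b, out])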

jinz2014 avatar Jul 16 '24 23:07 jinz2014

Hi @jinz2014 . Warp doesn't support shared memory in kernels directly, but you are free to use shared memory in native function snippets: https://nvidia.github.io/warp/modules/differentiability.html#custom-native-functions
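
Roughly, the pattern from that page looks like this: a sketch of a block-level sum using shared memory inside a native snippet (sizes and names are illustrative, and it assumes the kernel is launched with exactly 128 threads):

    import warp as wp

    # CUDA source injected verbatim into the generated kernel;
    # arr, out, and tid are the arguments of the decorated function
    snippet = """
        __shared__ int sum[128];

        sum[tid] = arr[tid];
        __syncthreads();

        if (tid == 0) {
            for (int i = 1; i < 128; ++i) {
                sum[0] += sum[i];
            }
            out[0] = sum[0];
        }
    """

    @wp.func_native(snippet)
    def block_reduce(arr: wp.array(dtype=int), out: wp.array(dtype=int), tid: int):
        ...

    @wp.kernel
    def reduce_kernel(arr: wp.array(dtype=int), out: wp.array(dtype=int)):
        block_reduce(arr, out, wp.tid())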

daedalus5 avatar Jul 17 '24 14:07 daedalus5

Hi @daedalus5, I see. Will developers need to compute the local ID (i.e., threadIdx.x) within a thread block themselves? I think wp.tid() means the global ID.

jinz2014 avatar Jul 17 '24 14:07 jinz2014

Are there functions for the local ID, thread block size, and thread block ID?

jinz2014 avatar Jul 17 '24 14:07 jinz2014

Yes, wp.tid() is a global ID. We don't have Python functions for those, but you should be able to access, e.g., threadIdx.x in a native snippet as you would normally.
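
For instance, something like this sketch (CUDA device only; names are illustrative) would expose the built-in indices inside a snippet:

    snippet = """
        int local_id = threadIdx.x;                   // thread index within the block
        int block_id = blockIdx.x;                    // block index within the grid
        int block_dim = blockDim.x;                   // threads per block
        out[tid] = block_id * block_dim + local_id;   // matches the global ID for a 1D launch
    """

    @wp.func_native(snippet)
    def thread_ids(out: wp.array(dtype=int), tid: int):
        ...

    @wp.kernel
    def ids_kernel(out: wp.array(dtype=int)):
        thread_ids(out, wp.tid())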

daedalus5 avatar Jul 17 '24 15:07 daedalus5

Do snippets support template types?

    snippet = '''
        __shared__ T sum[256];
    '''

jinz2014 avatar Jul 18 '24 14:07 jinz2014

No, I don't think templates would work in snippets currently.
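
One possible workaround (not a Warp feature, just ordinary Python string substitution) is to generate a concrete snippet per type, something like:

    # instantiate the snippet for each concrete C type you need,
    # in place of a template parameter
    def make_snippet(ctype: str) -> str:
        return f"""
            __shared__ {ctype} sum[256];
            // ... reduction body using sum ...
        """

    @wp.func_native(make_snippet("float"))
    def reduce_float(arr: wp.array(dtype=float), tid: int):
        ...

    @wp.func_native(make_snippet("int"))
    def reduce_int(arr: wp.array(dtype=int), tid: int):
        ...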

daedalus5 avatar Jul 18 '24 19:07 daedalus5