c-mera
OpenCL inline-code support
I would like to emit OpenCL programs into strings. This is an example:
const char *source =
"\
enum{N=100}; \
const sampler_t sampler = CLK_NORMALIZED_COORDS_FALSE| \
CLK_ADDRESS_CLAMP_TO_EDGE| \
CLK_FILTER_LINEAR; \
__kernel void red(int n, __write_only image2d_t rgba, \
__read_only image3d_t vol) \
{ \
int x=get_global_id(0), y=get_global_id(1); \
float4 \
start=(float4)(x,y,0,0), \
target=(float4)(64,64,64,0), \
delta=normalize(target-start), \
val=0.0; \
int i=0; \
for(i=-N;i<N;i++) \
val+=read_imagef(vol,sampler,start+i*2*delta); \
write_imagef(rgba,(int2)(x,y),val.xxxx/3); \
}";
I hacked something together using comment which seems to work. I'm sure it can be improved.
(lisp
  (defun replace-newline-with-backslash-newline (string)
    ;; this is from common lisp cookbook i got it from here:
    ;; http://stackoverflow.com/questions/4366668/str-replace-in-lisp
    ;; i modified it to only search for newlines
    (let ((part #\Newline)
          (replacement "\\
"))
      (with-output-to-string (out)
        (loop
          for old-pos = 0 then (+ pos 1)
          for pos = (position part string
                              :start old-pos
                              :test #'char=)
          do (write-string string out
                           :start old-pos
                           :end (or pos (cl:length string)))
          when pos do (write-string replacement out)
          while pos)))))

(use-variables CLK_NORMALIZED_COORDS_FALSE
               CLK_ADDRESS_CLAMP_TO_EDGE
               CLK_FILTER_LINEAR)

(loop for e in '(__read_only __write_only __kernel)
      do (cgen:add-qualifier e))

(defmacro make-float4 (x &optional (y 0) (z 0) (w 0))
  ;; FIXME x,y,z,w only works with symbols. not with constants
  `(cast float4 (comment
                 (lisp (format nil "(~{~a~^,~})" (list ,x ,y ,z ,w)))
                 :prefix "")))

(defmacro make-int2 (x &optional (y 0))
  `(cast int2 (comment
               (lisp (format nil "(~{~a~^,~})" (list ,x ,y)))
               :prefix "")))

(simple-print
 (comment
  (format nil "const char *source = \"\\~%~a\""
          (replace-newline-with-backslash-newline
           (with-output-to-string (*standard-output*)
             (simple-print
              (function integrate_along_ray ((int n)
                                             (__write_only image2d_t rgba)
                                             (__read_only image3d_t vol))
                  -> |__KERNEL VOID|
                (decl ((const sampler_t sampler (\| CLK_NORMALIZED_COORDS_FALSE
                                                    CLK_ADDRESS_CLAMP_TO_EDGE
                                                    CLK_FILTER_LINEAR))
                       (int x (funcall get_global_id 0))
                       (int y (funcall get_global_id 1))
                       (const int n 64)
                       (const float val1 0.0))
                  (decl ((float4 start (make-float4 x y))
                         (float4 val val1)
                         (float4 target (make-float4 n n n)))
                    (decl ((float4 delta (funcall normalize (- target start))))
                      (for ((int i (- 100)) (< i 100) i++)
                        (+= val (funcall read_imagef vol sampler
                                         (+ start (* 2 i delta)))))
                      (funcall write_imagef rgba (make-int2 x y) (/ (oref val xxxx) 3))))))))))
  :prefix ""))
This expands into:
const char *source = "\
\
__kernel void integrate_along_ray(int n, __write_only image2d_t rgba, __read_only image3d_t vol)\
{\
const sampler_t sampler = CLK_NORMALIZED_COORDS_FALSE | CLK_ADDRESS_CLAMP_TO_EDGE | CLK_FILTER_LINEAR;\
int x = get_global_id(0);\
int y = get_global_id(1);\
const int n = 64;\
const float val1 = 0.00000;\
float4 start = ((float4)\
(x,y,0,0));\
float4 val = val1;\
float4 target = ((float4)\
(n,n,n,0));\
float4 delta = normalize(target - start);\
for(int i = -100; i < 100; i++){\
val += read_imagef(vol, sampler, (start + (2 * i * delta)));\
}\
write_imagef(rgba, ((int2)\
(x,y)), val.xxxx / 3);\
}"
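To see the newline escaping in isolation, here is a minimal sketch in plain Common Lisp (no c-mera needed). `escape-newlines` is a hypothetical name for a compact variant of the `replace-newline-with-backslash-newline` helper above; it walks the string character by character instead of searching with `position`, but the result should be the same: every newline gets a backslash in front of it, so the C compiler reads the whole kernel as one continued string literal.

```lisp
;; Plain Common Lisp sketch; ESCAPE-NEWLINES is a hypothetical name,
;; not part of c-mera.
(defun escape-newlines (string)
  "Return a copy of STRING with a backslash inserted before every newline."
  (with-output-to-string (out)
    (loop for ch across string
          do (when (char= ch #\Newline)
               (write-char #\\ out))
             (write-char ch out))))

;; Prints three lines; the first two end in a backslash:
;;   {\
;;   return;\
;;   }
(write-line (escape-newlines (format nil "{~%return;~%}")))
```

Because the backslash-newline pairs are removed by the C preprocessor, the string that reaches the OpenCL compiler contains no line breaks at all, which is worth keeping in mind if the kernel ever contains `//` comments or preprocessor directives.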
Hi, yes, your solution using comment is the way we do things like this. :)
I usually prefer to keep code like that in a separate source file, but if you want strings containing printed code, a certain amount of cruft is unavoidable. It is possible, though. I played with it a little and I really like your solution. Looking for some primitive we could put in the cg-user package to make this more reusable, I landed at
(defmacro codestring (&body body)
  `(comment ,(format nil "\"~a\""
                     (replace-newline-with-backslash-newline
                      (with-output-to-string (*standard-output*)
                        (eval `(simple-print (progn ,@body))))))
            :prefix ""))
It can be used as follows
(decl ((const char *foo
        (codestring
          (defkernel foo ()
            (return))))))
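The key trick in `codestring` is that the body is printed while the macro is being expanded, so the expansion itself is already a constant. A minimal sketch of just that mechanism in plain Common Lisp follows; `printed-string` is a hypothetical stand-in, not the c-mera definition, and it skips the quoting and newline escaping.

```lisp
;; Plain Common Lisp, no c-mera required. PRINTED-STRING is a hypothetical
;; stand-in for the idea behind codestring, not its actual definition.
(defmacro printed-string (&body body)
  "Expand into a literal string of whatever BODY prints during macroexpansion."
  (with-output-to-string (*standard-output*)
    (eval `(progn ,@body))))

;; The expansion is already a constant string; nothing is printed at run time.
(macroexpand-1 '(printed-string (write-string "__kernel void foo() { }")))
```

One consequence of using `eval` this way is that the body is evaluated in the null lexical environment, so it cannot refer to local variables of the code surrounding the macro call.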
Btw, I'm not a big fan of our return syntax when multiple symbols make up the type (though I don't have an alternative I could propose), so especially for the underscore-heavy compute languages I have things like
(defmacro defkernel (name param &body body)
  `(function ,name ,param -> (__kernel void)
     ,@body))
(this requires (cgen::add-qualifier '__kernel))
@lispbub Do you have a better solution than the above? Do you think we should add this stand-alone, or maybe in support of another target, ocl-gen? :)
Now we have an additional generator: oclgen. Currently it's just a prototype which adds some basic qualifiers, but I'm in the midst of adding more ocl elements. The example above should now also run in a separate file (codestring is defined in cgu-base.lisp):
(defmacro make-float4 (x &optional (y 0) (z 0) (w 0))
  ;; FIXME x,y,z,w only works with symbols. not with constants
  `(cast float4 (comment
                 (lisp (format nil "(~{~a~^,~})" (list ,x ,y ,z ,w)))
                 :prefix "")))

(defmacro make-int2 (x &optional (y 0))
  `(cast int2 (comment
               (lisp (format nil "(~{~a~^,~})" (list ,x ,y)))
               :prefix "")))

(defmacro defkernel (name param &body body)
  `(function ,name ,param -> (__kernel void)
     ,@body))

(decl ((const char *foo
        (codestring
          (defkernel foo () (return))))
       (const char *bar
        (codestring
          (defkernel integrate_along_ray ((int n)
                                          (__write_only image2d_t rgba)
                                          (__read_only image3d_t vol))
            (decl ((const sampler_t sampler (\| CLK_NORMALIZED_COORDS_FALSE
                                                CLK_ADDRESS_CLAMP_TO_EDGE
                                                CLK_FILTER_LINEAR))
                   (int x (funcall get_global_id 0))
                   (int y (funcall get_global_id 1))
                   (const int n 64)
                   (const float val1 0.0))
              (decl ((float4 start (make-float4 x y))
                     (float4 val val1)
                     (float4 target (make-float4 n n n)))
                (decl ((float4 delta (funcall normalize (- target start))))
                  (for ((int i (- 100)) (< i 100) i++)
                    (+= val (funcall read_imagef vol sampler
                                     (+ start (* 2 i delta)))))
                  (funcall write_imagef rgba (make-int2 x y) (/ (oref val xxxx) 3))))))))))
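The `make-float4` and `make-int2` macros above lean entirely on one `format` control string to build the argument list of the vector literal. A small sketch of what that control string produces, in plain Common Lisp; `vector-args` is a hypothetical helper name used only for illustration. Strings are passed instead of symbols because, with the standard reader, a symbol such as `x` would print as `X`.

```lisp
;; Plain Common Lisp sketch; VECTOR-ARGS is a hypothetical helper name.
(defun vector-args (&rest components)
  "Join COMPONENTS with commas and wrap them in parentheses."
  ;; ~{ ... ~} iterates over the list, ~a prints one element,
  ;; ~^ stops before the comma once the list is exhausted.
  (format nil "(~{~a~^,~})" components))

(vector-args "x" "y" 0 0)   ; => "(x,y,0,0)"
(vector-args 64 64 64 0)    ; => "(64,64,64,0)"
```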
Cool! How about extending this so that it is possible to define strings containing ocl programs in c++ or cuda source? I.e. how about changing the generator for a subform? Is that even possible?
Regarding the literal syntax, I think we will need an extension of cast or something along those lines to support ocl int2 x = ((int2) (1 2)); with (just a suggestion) (decl ((int2 x (int2 1 2)))). In that case int2 et al. would just be macros that expand to casts or, possibly, to something like a clist.
Is it really necessary to run a different generator for subforms? Can't we simply derive oclgen from cugen? By the way oclgen is derived from cxxgen.
And the vector initialization: (decl ((int2 x (int2 1 2)))) now expands to int2 x = (int2)(1 2); without additional macro hacks.
Cool, looks good :)
IIRC ocl is more like C, not C++, so I thought we would derive it from cgen, not cxxgen. Otherwise we drag a lot of invalid syntax from cxxgen into oclgen. But then one might want to define ocl kernels, in the ocl C language, in a C++ program, which would make it necessary to switch generators (given that I remember correctly; otherwise it is fine). I think we can probably argue that people won't want to define ocl code in cuda files (this was just a provocative example). If this is super impractical in our implementation we can of course think about alternative solutions; I just wanted to know how crazy an idea it actually is :)
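As a rough illustration of what "switching the generator for a subform" could look like in Common Lisp in general, here is a sketch based on dynamic rebinding of a special variable. This is not how c-mera selects its generators; `*generator*`, `emit`, and `with-generator` are all hypothetical names, and the generated "code" is just a tagged string.

```lisp
;; Plain Common Lisp sketch of per-subform generator switching.
;; All names here are hypothetical and unrelated to c-mera's internals.
(defvar *generator* :cxxgen
  "Names the generator that is in effect for the current dynamic extent.")

(defun emit (code)
  "Tag CODE with the generator that was active when it was emitted."
  (format nil "/* ~(~a~) */ ~a" *generator* code))

(defmacro with-generator ((name) &body body)
  "Evaluate BODY with *GENERATOR* dynamically rebound to NAME."
  `(let ((*generator* ,name))
     ,@body))

;; The binding is only in effect inside WITH-GENERATOR and is restored afterwards.
(list (emit "int main() {}")
      (with-generator (:oclgen)
        (emit "__kernel void foo() {}"))
      (emit "int after;"))
;; => ("/* cxxgen */ int main() {}"
;;     "/* oclgen */ __kernel void foo() {}"
;;     "/* cxxgen */ int after;")
```

Whether something like this is practical depends on when the generator-specific syntax is resolved (read time, macroexpansion time, or print time), which the sketch deliberately leaves out.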