c-mera

OpenCL inline-code support

Open plops opened this issue 9 years ago • 6 comments

I would like to emit OpenCL programs into strings. This is an example:

const char *source = 
  "\
enum{N=100};                                \
const sampler_t sampler = CLK_NORMALIZED_COORDS_FALSE|          \
                          CLK_ADDRESS_CLAMP_TO_EDGE|            \
                          CLK_FILTER_LINEAR;                \
__kernel void red(int n, __write_only image2d_t rgba,                   \
                  __read_only image3d_t vol)                \
{                                                                       \
  int x=get_global_id(0), y=get_global_id(1);               \
  float4                                \
    start=(float4)(x,y,0,0),                        \
    target=(float4)(64,64,64,0),                    \
    delta=normalize(target-start),                  \
    val=0.0;                                \
  int i=0;                              \
  for(i=-N;i<N;i++)                         \
    val+=read_imagef(vol,sampler,start+i*2*delta);          \
  write_imagef(rgba,(int2)(x,y),val.xxxx/3);                \
}";

I hacked something together using the comment form, which seems to work. I'm sure it can be improved.

(lisp
 (defun replace-newline-with-backslash-newline (string)
   ;; this is from common lisp cookbook i got it from here:
   ;; http://stackoverflow.com/questions/4366668/str-replace-in-lisp
   ;; i modified it to only search for newlines
   (let ((part #\Newline)
     (replacement "\\
"))
     (with-output-to-string (out)
       (loop
      for old-pos = 0 then (+ pos 1)
      for pos = (position part string
                  :start old-pos
                  :test #'char=)
      do (write-string string out
               :start old-pos
               :end (or pos (cl:length string)))
      when pos do (write-string replacement out)
      while pos)))))
(use-variables CLK_NORMALIZED_COORDS_FALSE
           CLK_ADDRESS_CLAMP_TO_EDGE
           CLK_FILTER_LINEAR)


(loop for e in '(__read_only __write_only __kernel)
     do (cgen:add-qualifier e))

(defmacro make-float4 (x &optional (y 0) (z 0) (w 0))
  ;; FIXME x,y,z,w only works with symbols. not with constants 
  `(cast float4 (comment
         (lisp (format nil "(~{~a~^,~})" (list ,x ,y ,z ,w)))
         :prefix "")))
(defmacro make-int2 (x &optional (y 0))
  `(cast int2 (comment
           (lisp (format nil "(~{~a~^,~})" (list ,x ,y)))
           :prefix "")))
(simple-print
 (comment
  (format nil "const char *source = \"\\~%~a\""
      (replace-newline-with-backslash-newline
       (with-output-to-string (*standard-output*)
         (simple-print
          (function integrate_along_ray ((int n)
                         (__write_only image2d_t rgba)
                         (__read_only image3d_t vol))
          -> |__KERNEL VOID|
        (decl ((const sampler_t sampler (\| CLK_NORMALIZED_COORDS_FALSE
                            CLK_ADDRESS_CLAMP_TO_EDGE
                            CLK_FILTER_LINEAR))
               (int x (funcall get_global_id 0))
               (int y (funcall get_global_id 1))
               (const int n 64)
               (const float val1 0.0))
          (decl ((float4 start (make-float4 x y))
             (float4 val val1)
             (float4 target (make-float4 n n n)))
            (decl ((float4 delta (funcall normalize (- target start))))
              (for ((int i (- 100)) (< i 100) i++)
            (+= val (funcall read_imagef vol sampler
                     (+ start (* 2 i delta)))))
              (funcall write_imagef rgba (make-int2 x y) (/ (oref val xxxx) 3)))))
        ))))) :prefix ""))

This expands into:

const char *source = "\
\
__kernel void integrate_along_ray(int n, __write_only image2d_t rgba, __read_only image3d_t vol)\
{\
    const sampler_t sampler = CLK_NORMALIZED_COORDS_FALSE | CLK_ADDRESS_CLAMP_TO_EDGE | CLK_FILTER_LINEAR;\
    int x = get_global_id(0);\
    int y = get_global_id(1);\
    const int n = 64;\
    const float val1 = 0.00000;\
    float4 start = ((float4)\
    (x,y,0,0));\
    float4 val = val1;\
    float4 target = ((float4)\
    (n,n,n,0));\
    float4 delta = normalize(target - start);\
    for(int i = -100; i < 100; i++){\
        val += read_imagef(vol, sampler, (start + (2 * i * delta)));\
    }\
    write_imagef(rgba, ((int2)\
    (x,y)), val.xxxx / 3);\
}"

plops avatar Sep 07 '15 12:09 plops

Hi, yes, your solution using comment is the way we do things like this. :)

I usually prefer a separate source file for things like that. If you want strings containing printed code, a certain amount of cruft is required, but it is possible. I played with it a little and I really like your solution. Looking for a reusable primitive we could put in the cg-user package, I landed at:

(defmacro codestring (&body body)
  `(comment ,(format nil "\"~a\""
                     (replace-newline-with-backslash-newline
                      (with-output-to-string (*standard-output*)
                        (eval `(simple-print (progn ,@body))))))
            :prefix ""))

It can be used as follows:

(decl ((const char *foo
              (codestring
                (defkernel foo () 
                  (return))))))

Btw, I'm not a big fan of our return syntax when multiple symbols make up the type (though I don't have an alternative to propose), so especially for the underscore-heavy compute languages I have things like:

(defmacro defkernel (name param &body body)
  `(function ,name ,param -> (__kernel void)
     ,@body))

(this requires (cgen::add-qualifier '__kernel))

kiselgra avatar Sep 09 '15 02:09 kiselgra

@lispbub Do you have a better solution than the above? Do you think we should add this stand-alone, or maybe in support of another target, ocl-gen? :)

kiselgra avatar Sep 09 '15 02:09 kiselgra

Now we have an additional generator: oclgen. Currently it's just a prototype which adds some basic qualifiers, but I'm in the midst of adding more ocl elements. The example above should now also run in a separate file (codestring is defined in cgu-base.lisp):

(defmacro make-float4 (x &optional (y 0) (z 0) (w 0))
    ;; FIXME x,y,z,w only works with symbols. not with constants 
    `(cast float4 (comment
                   (lisp (format nil "(~{~a~^,~})" (list ,x ,y ,z ,w)))
                   :prefix "")))
(defmacro make-int2 (x &optional (y 0))
 `(cast int2 (comment
              (lisp (format nil "(~{~a~^,~})" (list ,x ,y)))
              :prefix "")))

(defmacro defkernel (name param &body body)
  `(function ,name ,param -> (__kernel void)
       ,@body))

    (decl ((const char *foo 
            (codestring
             (defkernel foo () (return))))

           (const char *bar
            (codestring
             (defkernel integrate_along_ray ((int n)
                                             (__write_only image2d_t rgba)
                                             (__read_only image3d_t vol))
              (decl ((const sampler_t sampler (\| CLK_NORMALIZED_COORDS_FALSE
                                               CLK_ADDRESS_CLAMP_TO_EDGE
                                               CLK_FILTER_LINEAR))
                     (int x (funcall get_global_id 0))
                     (int y (funcall get_global_id 1))
                     (const int n 64)
                     (const float val1 0.0))
               (decl ((float4 start (make-float4 x y))
                      (float4 val val1)
                      (float4 target (make-float4 n n n)))
                (decl ((float4 delta (funcall normalize (- target start))))
                 (for ((int i (- 100)) (< i 100) i++)
                  (+= val (funcall read_imagef vol sampler
                           (+ start (* 2 i delta)))))
                 (funcall write_imagef rgba (make-int2 x y) (/ (oref val xxxx) 3))))))))))

lispbub avatar Sep 15 '15 15:09 lispbub

Cool! How about extending this so that it is possible to define strings containing ocl programs in c++ or cuda source? I.e. how about changing the generator for a subform? Is that even possible?

Regarding the literal syntax I think we will need an extension of cast or something along those lines to support ocl int2 x = ((int2) (1 2)); with (just a suggestion) (decl ((int2 x (int2 1 2)))). In that case int2 et al would just be macros to expand to casts, or, possibly, to something like a clist.

kiselgra avatar Sep 16 '15 04:09 kiselgra

Is it really necessary to run a different generator for subforms? Can't we simply derive oclgen from cugen? By the way oclgen is derived from cxxgen.

And the vector-initialization: (decl ((int2 x (int2 1 2)))) now expands to int2 x = (int2)(1 2); without additional macro-hacks.

lispbub avatar Sep 16 '15 09:09 lispbub

Cool, looks good :)

IIRC ocl is more like C than C++, so I thought we would derive it from cgen, not cxxgen. Otherwise we drag a lot of invalid syntax from cxxgen into oclgen. But then one might want to define ocl kernels, in the ocl C language, inside a C++ program, which would make it necessary to switch generators (given that I remember correctly; otherwise it is fine). I think we can probably argue that people won't want to define ocl code in cuda files (this was just a provocative example). If this is very impractical in our implementation we can of course think about alternative solutions; I just wanted to know how crazy an idea it actually is :)

kiselgra avatar Sep 17 '15 06:09 kiselgra