
gpu support fail running python

Open: iligra opened this issue 2 years ago • 1 comment

Hi, I built codon v0.16.1 (https://github.com/exaloop/codon/releases/tag/v0.16.1) from the source tarball and successfully ran the GPU examples using codon. However, when I tried to run modified code from Python, I got this error: CUDA error at .../codon-0.16.1/codon/runtime/gpu.cpp:102: named symbol not found

I actually tried two modifications, both of which seem reasonable to me, and got the same error in both cases. I prefer the first template, as it runs 25-30% faster than the second (using codon). I also tried removing external functions and variables from the code, but this didn't help.

Code Example 1:

import codon

@codon.jit(debug=True)
def run_mandelbrot(pixels):
    import gpu

    @gpu.kernel
    def mandelbrot(pixels):
        idx = (gpu.block.x * gpu.block.dim.x) + gpu.thread.x
        i, j = divmod(idx, 4096)
        a1 = -2.00
        b1 =  0.47
        a2 = -1.12
        b2 =  1.12
        s1 = a1 + (j/4096)*(b1 - a1)
        s2 = a2 + (i/4096)*(b2 - a2)
        c = complex(s1, s2)
        z = 0j
        iteration = 0

        while abs(z) <= 2 and iteration < 1000:
            z = z**2 + c
            iteration += 1

        pixels[idx] = int(255 * iteration/1000)

    mandelbrot(pixels, grid=(4096*4096)//1024, block=1024)
    return pixels

import numpy as np
import matplotlib.pyplot as plt
from time import time

t0 = time()
pixels = [0 for _ in range(4096 * 4096)]
pixels = run_mandelbrot(pixels)
t1 = time()
print(f'[codon] took {t1 - t0} seconds')

plt.imshow(np.array(pixels).reshape(4096, 4096))
plt.show()
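
For cross-checking, the per-thread arithmetic in Example 1 can be reproduced on the CPU in plain Python. This is only a sketch for comparing results against a working GPU run; it does not address the CUDA error itself. The helper names `mandelbrot_pixel` and `launch_shape` and the `n`, `max_iter` and `block` parameters are illustrative assumptions (the post hardcodes 4096, 1000 and 1024).

```python
# CPU reference for the per-index computation in Example 1.
# mandelbrot_pixel and launch_shape are hypothetical helper names,
# not part of codon or its gpu module.

def mandelbrot_pixel(idx, n=4096, max_iter=1000):
    # Same mapping as the kernel: flat index -> (row i, column j).
    i, j = divmod(idx, n)
    a1, b1 = -2.00, 0.47   # real-axis range
    a2, b2 = -1.12, 1.12   # imaginary-axis range
    s1 = a1 + (j / n) * (b1 - a1)
    s2 = a2 + (i / n) * (b2 - a2)
    c = complex(s1, s2)
    z = 0j
    iteration = 0
    while abs(z) <= 2 and iteration < max_iter:
        z = z**2 + c
        iteration += 1
    return int(255 * iteration / max_iter)


def launch_shape(n=4096, block=1024):
    # Grid size used in the post: one thread per pixel.
    total = n * n
    grid = total // block
    # The third value reports whether grid * block covers every pixel exactly.
    return grid, block, grid * block == total


if __name__ == "__main__":
    print(launch_shape())
    # A small grid keeps the pure-Python loop fast.
    small = [mandelbrot_pixel(k, n=8, max_iter=50) for k in range(8 * 8)]
    print(small[:8])
```

With the sizes from the post, 4096 * 4096 is an exact multiple of 1024, so every index computed as block.x * block.dim.x + thread.x falls inside `pixels` and no bounds check is needed in the kernel.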

Code Example 2:

import codon

@codon.jit(debug=True)
def run_mandelbrot(pixels):
    @par(gpu=True, collapse=2)
    for i in range(4096):
        for j in range(4096):
            a1 = -2.00
            b1 =  0.47
            a2 = -1.12
            b2 =  1.12
            s1 = a1 + (j/4096)*(b1 - a1)
            s2 = a2 + (i/4096)*(b2 - a2)
            c = complex(s1, s2)
            z = 0j
            iteration = 0

            while abs(z) <= 2 and iteration < 1000:
                z = z**2 + c
                iteration += 1
            pixels[i*4096 + j] = int(255 * iteration/1000)

    return pixels

import numpy as np
import matplotlib.pyplot as plt
from time import time

t0 = time()
pixels = [0 for _ in range(4096 * 4096)]
pixels = run_mandelbrot(pixels)
t1 = time()
print(f'[codon] took {t1 - t0} seconds')

plt.imshow(np.array(pixels).reshape(4096, 4096))
plt.show()

It would be nice if these code examples could run successfully. Thank you.

iligra, Jun 14 '23 09:06

Pinging @arshajii, who can help with this one.

inumanag, Jul 26 '23 13:07