CUDA not available or Failed to Launch Kernels (error code invalid argument)

Open • kings177 opened this issue 9 months ago • 15 comments

References: https://github.com/HigherOrderCO/Bend/issues/320 by rubenjr0 with contents:

" Hello! I think I've encountered a bug. When running this example from the readme:

def sum(depth, x):
  switch depth:
    case 0:
      return x
    case _:
      fst = sum(depth-1, x*2+0) # adds the fst half
      snd = sum(depth-1, x*2+1) # adds the snd half
      return fst + snd
    
def main:
  return sum(30, 0)

The output is 0. I've tried bend run, bend run-c, and bend gen-cu (bend run-cu says CUDA is not available, so I compiled the generated code manually with nvcc).

On my machine, sum(24, 0) outputs 8388608, but equivalent Haskell and Python programs return 140737479966720. The results start to diverge when depth >= 13.

I was wondering what could be causing these issues: both the incorrect result when depth >= 13 and the result of 0 when depth >= 25.

My computer specs:

OS: Pop_OS 22.04
CPU: AMD Ryzen 5 2600x (12 cores)
GPU: NVIDIA RTX 4060 ti (16GB) "

kings177 • May 17 '24 17:05
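
For anyone reproducing the numbers in the quoted report, here is a minimal Python sketch of the equivalent program. The names `tree_sum`, `closed_form`, `wrapped`, and `WRAP_BITS` are hypothetical. The 24-bit wrap-around is only an assumption, used to check whether it matches the reported values; the report itself does not state the cause.

```python
# Plain-Python counterpart of the `sum` example from the quoted report.
# Python integers are arbitrary precision, so these results are exact.

def tree_sum(depth: int, x: int = 0) -> int:
    """Direct port of the recursive example (only practical for small depths)."""
    if depth == 0:
        return x
    fst = tree_sum(depth - 1, x * 2 + 0)  # first half
    snd = tree_sum(depth - 1, x * 2 + 1)  # second half
    return fst + snd


def closed_form(depth: int) -> int:
    """Sum of 0 .. 2**depth - 1, which is what the recursion yields from x = 0."""
    n = 2 ** depth
    return n * (n - 1) // 2


# ASSUMPTION (not stated in the report): results wrap modulo 2**24.
WRAP_BITS = 24


def wrapped(depth: int) -> int:
    """Exact result reduced modulo 2**WRAP_BITS, to compare with the report."""
    return closed_form(depth) % (2 ** WRAP_BITS)


if __name__ == "__main__":
    for depth in (12, 13, 24, 25, 30):
        print(depth, closed_form(depth), wrapped(depth))
```

Under that assumption, `wrapped(24)` gives 8388608, the first mismatch with the exact result appears at depth 13, and the result is 0 from depth 25 onward. All three agree with the observations in the report.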