Bug on Linux - Clang

Open johnykes opened this issue 1 year ago • 8 comments

Tried with all Llama models + tiny UI and I always get this. No other errors in exo terminal

Error: Failed to fetch completions: Error processing prompt (see logs with DEBUG>=2): Command '['clang', '-shared', '-march=native', '-O2', '-Wall', '-Werror', '-x', 'c', '-fPIC', '-ffreestanding', '-nostdlib', '-', '-o', '/tmp/tmplghq47j4']' returned non-zero exit status 1.

johnykes avatar Nov 15 '24 16:11 johnykes

how about sharing the debug log?

veyselsahin avatar Nov 16 '24 17:11 veyselsahin

We really need to spend some time pinning this down. It's happening quite a bit (see https://github.com/exo-explore/exo/issues/458). It's a mystery because we don't have the logs from clang.

I think we might need to test on a bunch of different hardware and dig in.
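One way to get at the missing clang logs is to rerun the failing command by hand and keep its stderr, which `subprocess.check_output` does not capture by default. The following is a minimal sketch, not exo's or tinygrad's code: `run_and_capture` and `demo` are hypothetical helper names, the flags are copied from the error message in the original report, and clang is assumed to be on the PATH.

```python
import subprocess
import sys

# Flags copied from the failing command in the report. The output path is
# appended by the caller.
CLANG_CMD = ['clang', '-shared', '-march=native', '-O2', '-Wall', '-Werror',
             '-x', 'c', '-fPIC', '-ffreestanding', '-nostdlib', '-', '-o']

def run_and_capture(cmd, src):
    """Run cmd with src on stdin and return (exit code, stderr text)."""
    proc = subprocess.run(cmd, input=src.encode('utf-8'), capture_output=True)
    return proc.returncode, proc.stderr.decode('utf-8', errors='replace')

def demo(output_path='/tmp/exo_clang_test.so'):
    """Compile a trivial C function (a stand-in for the generated kernel)."""
    code, err = run_and_capture(CLANG_CMD + [output_path],
                                'int add(int a, int b) { return a + b; }\n')
    print(f'clang exit status: {code}', file=sys.stderr)
    print(err, file=sys.stderr)
```

Calling `demo()` on an affected machine should print whatever clang wrote to stderr, which is the part missing from the reports so far.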

AlexCheema avatar Nov 18 '24 17:11 AlexCheema

Error: Failed to fetch completions: Error processing prompt (see logs with DEBUG>=2): OpenCL Compile Error

<kernel>:29:76: error: call to 'exp2' is ambiguous
  *((__global half4*)((data0+alu0))) = (half4)((((half)(val5.x))*cast0*(1/(exp2((cast0*((half)(-1.4426950408889634f))))+((half)(1.0f))))),(((half)(val5.y))*cast1*(1/(exp2((cast1*((half)(-1.4426950408889634f))))+((half)(1.0f))))),(((half)(val5.z))*cast2*(1/(exp2((cast2*((half)(-1.4426950408889634f))))+((half)(1.0f))))),(((half)(val5.w))*cast3*(1/(exp2((cast3*((half)(-1.4426950408889634f))))+((half)(1.0f))))));
cl_kernel.h:1473:24: note: candidate function
float __OVERLOADABLE__ exp2(float);
cl_kernel.h:1474:25: note: candidate function
double __OVERLOADABLE__ exp2(double);
cl_kernel.h:1475:25: note: candidate function
float2 __OVERLOADABLE__ exp2(float2);
cl_kernel.h:1477:25: note: candidate function
float3 __OVERLOADABLE__ exp2(float3);
cl_kernel.h:1479:25: note: candidate function
float4 __OVERLOADABLE__ exp2(float4);
cl_kernel.h:1480:25: note: candidate function
float8 __OVERLOADABLE__ exp2(float8);
cl_kernel.h:1481:26: note: candidate function
float16 __OVERLOADABLE__ exp2(float16);
cl_kernel.h:1482:26: note: candidate function
double2 __OVERLOADABLE__ exp2(double2);
cl_kernel.h:1484:26: note: candidate function
double3 __OVERLOADABLE__ exp2(double3);
cl_kernel.h:1486:26: note: candidate function
double4 __OVERLOADABLE__ exp2(double4);
cl_kernel.h:1487:26: note: candidate function
double8 __OVERLOADABLE__ exp2(double8);
cl_kernel.h:1488:27: note: candidate function
double16 __OVERLOADABLE__ exp2(double16);
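Reading the candidate list above, every `exp2` overload takes a `float` or `double` type, while the kernel calls it with a `half` argument, which would explain why the compiler calls the call ambiguous. A small sketch for checking this in other people's logs follows; `candidate_arg_types` and `has_half_overload` are hypothetical helpers that only parse the pasted text, and they say nothing about why a given OpenCL driver lacks `half` overloads.

```python
import re

# Matches declarations such as "float4 __OVERLOADABLE__ exp2(float4);".
# The underscores are optional because some pasted logs lost them.
CANDIDATE_RE = re.compile(r'(\w+)\s+_*OVERLOADABLE_*\s+exp2\((\w+)\)')

def candidate_arg_types(log):
    """Return the argument type of every exp2 overload named in the log."""
    return [arg for _ret, arg in CANDIDATE_RE.findall(log)]

def has_half_overload(log):
    """True if any listed exp2 candidate accepts a half or halfN argument."""
    return any(t.startswith('half') for t in candidate_arg_types(log))
```

If `has_half_overload` returns False for a log, the listed header offers no `half` variant of `exp2`, so the generated kernel would need a cast to `float` or a driver that provides the `half` overloads.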

zznatzz avatar Dec 02 '24 06:12 zznatzz

> We really need to spend some time pinning this down. Happening quite a bit #458 It's a mystery because we don't have the logs from clang.
>
> I think we might need to test on a bunch of different hardware and dig in.

OK, just check whether the NVIDIA drivers are installed; I think that is what causes these issues. So if you have an NVIDIA GPU, it is better to run it in a container (if you want to run it via Clang). Personally I suggest LXC-type containers, which run at near-native performance. I haven't tried others, so I can't comment on them.

qwertystars avatar Dec 12 '24 05:12 qwertystars

> Ok just check if nvidia drivers are installed I think that causes these issues. So if you have a nvidia GPU it is better to run it on a container (if you want to run it via Clang). Personally I suggest using lxc type containers they work near native performance. I haven't tried others so can't comment.

It wasn't in my case. I was testing things just to see how it all worked, not for performance reasons yet (I'm new and still getting my feet wet). This was a KVM-based VM with no GPU drivers at all, and I saw the same sort of thing. I haven't had time to look into it much, but I saw others having similar issues (like this thread). I'm going to try the latest version this weekend, since I see some of this was merged in.

Nurb4000 avatar Dec 12 '24 14:12 Nurb4000

[Images: IMG_20241214_103432, IMG_20241214_103450]

    def compile(self, src:str) -> bytes:
      # TODO: remove file write. sadly clang doesn't like the use of /dev/stdout here
      with tempfile.NamedTemporaryFile(delete=True) as output_file:
        subprocess.check_output(['clang', '-shared', *self.args, '-O2', '-Wall', '-Werror', '-x', 'c', '-fPIC', '-ffreestanding',
                                 '-nostdlib', '-', '-o', str(output_file.name)], input=src.encode('utf-8'))
        return pathlib.Path(output_file.name).read_bytes()

Guys, would these help? Maybe we try without the -nostdlib? For this I think we need to make another fork of tinygrad?

[Image: Screenshot_2024-12-14-10-45-54-317_com brave browser-edit]

    set(link_options -nostdlib)
    # The compiler might handle calls to math builtins by generating calls to
    # the respective libc math functions, in which case we cannot use these
    # builtins in our implementations of these functions. We check that this is
    # not the case by trying to link an executable, since linking would fail due
    # to unresolved references with -nostdlib if calls to libc functions were
    # generated.
    #
    # We also had issues with soft-float float16 conversion functions using both
    # compiler-rt and libgcc, so we also check whether we can convert from and
    # to float16 without calls to compiler runtime functions by trying to link
    # an executable with -nostdlib.

So does Clang not like the soft-float conversion? I don't know if this will help; I just found this while searching.
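The `-nostdlib` idea can be tried without forking tinygrad by running the same clang invocation by hand, once with the flag and once without, and comparing the results. This is only a sketch under assumptions: `clang_cmd` and `try_compile` are hypothetical names, `-march=native` stands in for `self.args` as it appears in the reported error, and the C source to feed in has to come from a failing run.

```python
import subprocess

# Flags from the compile() snippet, with '-march=native' standing in for
# self.args as seen in the reported error message.
BASE_FLAGS = ['-shared', '-march=native', '-O2', '-Wall', '-Werror',
              '-x', 'c', '-fPIC', '-ffreestanding']

def clang_cmd(output_path, nostdlib=True):
    """Build the clang argument list, optionally leaving out -nostdlib."""
    cmd = ['clang', *BASE_FLAGS]
    if nostdlib:
        cmd.append('-nostdlib')
    return cmd + ['-', '-o', output_path]

def try_compile(src, output_path, nostdlib=True):
    """Compile src from stdin and return (exit code, stderr text)."""
    proc = subprocess.run(clang_cmd(output_path, nostdlib),
                          input=src.encode('utf-8'), capture_output=True)
    return proc.returncode, proc.stderr.decode('utf-8', errors='replace')
```

If the build fails only with `nostdlib=True` and stderr mentions unresolved symbols, that would support the theory that clang is emitting calls into libc or the compiler runtime.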

qwertystars avatar Dec 14 '24 05:12 qwertystars

Any updates on this? I have the same error on RTX 4090 and RTX 3080 Ti

Failed to fetch completions: Error processing prompt (see logs with DEBUG>=2): OpenCL Compile Error

<kernel>:29:76: error: call to 'exp2' is ambiguous
  *((__global half4*)((data0+alu7))) = (half4)((((half)(val5.x))*cast0*(1/(exp2((cast0*((half)(-1.4426950408889634f))))+((half)(1.0f))))),(((half)(val5.y))*cast1*(1/(exp2((cast1*((half)(-1.4426950408889634f))))+((half)(1.0f))))),(((half)(val5.z))*cast2*(1/(exp2((cast2*((half)(-1.4426950408889634f))))+((half)(1.0f))))),(((half)(val5.w))*cast3*(1/(exp2((cast3*((half)(-1.4426950408889634f))))+((half)(1.0f))))));
                                                                           ^~~~
cl_kernel.h:1473:24: note: candidate function
float __OVERLOADABLE__ exp2(float);

SebastianVivoverse avatar Feb 05 '25 15:02 SebastianVivoverse

> Any updates on this? I have the same error on RTX 4090 and RTX 3080 Ti

Ubuntu: 24.04.1 GPU: RTX 4060

I fixed this issue by reinstalling the tools listed below. I think you might not have cuDNN installed, which was the case for me.

- NVIDIA driver (verify with nvidia-smi)
- CUDA Toolkit (verify with nvcc --version)
- cuDNN library (verify with mnistCUDNN)

Then run this command: HF_ENDPOINT=https://hf-mirror.com exo
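A quick way to see which of these tools a machine can find is to look each one up on the PATH. This is a rough sketch: `find_tools` and `missing_tools` are hypothetical helpers, `clang` is added because it is the compiler in the failing command, and `mnistCUDNN` is a sample program that may not be on the PATH even when cuDNN is installed, so a miss there is not conclusive.

```python
import shutil

# Executables named in this thread: clang from the failing command, and the
# three verification commands from the checklist above.
TOOLS = ['clang', 'nvidia-smi', 'nvcc', 'mnistCUDNN']

def find_tools(names=TOOLS):
    """Map each tool name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}

def missing_tools(names=TOOLS):
    """Return the tool names that could not be found on PATH."""
    return [name for name, path in find_tools(names).items() if path is None]
```

Printing `find_tools()` alongside the error message in a report would make it easier to compare setups across the different machines in this thread.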

realderp avatar Feb 21 '25 08:02 realderp