Engininja2 comments


DRAFT: Introduction of CUDA Graphs to LLama.cpp

`nodes`, `paramsDriver`, and `paramsRuntime` are being used across multiple calls of the function but their data is only loaded in an earlier call. Should they be static?
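To illustrate the concern, here is a minimal sketch of the difference between a plain local and a `static` local when data is loaded only on an earlier call. It is not the PR's code: the function names, the `first_call` flag, and the `std::vector<int>` type are assumptions for illustration, with `nodes` borrowed from the comment.

```cpp
#include <vector>

// Hypothetical stand-in for the reviewed function: `nodes` is a plain local,
// so whatever an earlier call loaded is gone by the time of the next call.
int node_count_local(bool first_call) {
    std::vector<int> nodes;              // re-created empty on every call
    if (first_call) {
        nodes = {1, 2, 3};               // data is only loaded on the first call
    }
    return static_cast<int>(nodes.size());
}

// Same logic with a static local: the vector is constructed once and keeps
// its contents between calls.
int node_count_static(bool first_call) {
    static std::vector<int> nodes;       // persists across calls
    if (first_call) {
        nodes = {1, 2, 3};
    }
    return static_cast<int>(nodes.size());
}
```

With the plain local, a later call that skips the loading branch sees an empty vector; with the `static` local it still sees the three entries. A static local is shared by every caller of the function, so this only suits state that is meant to be global to the process.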

Add UC Berkeley's Large World Models

The text models seem to be plain llama models, so you can use convert.py for those.

AMD MI300 GPU (gfx942) is lower than expected

There are a few things I can think of that could be slowing you down. First, `LLAMA_HIP_UMA=1` is for integrated graphics in the CPU, and will slow down actual...

HIP SDK with AMD iGPU rocBLAS error

Windows doesn't support `HSA_OVERRIDE_GFX_VERSION` and probably doesn't have its own equivalent. You would need to compile a Tensile library for gfx1103 for rocBLAS 5.7, or use Linux.

HIP SDK with AMD iGPU rocBLAS error

Unlike RDNA2, where everything is more or less gfx1030, RDNA3 ISAs have significant differences. The linked comment notes '(more than "-ngl 32" resulted in gibberish)'. You could try offloading 1...

[User] AMD GPU slower than CPU

The RX 560 may be slower in part because it's using the fallback code for `__dp4a()`: its ISA lacks a corresponding opcode, and the compiler may not be choosing...
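For context, `__dp4a(a, b, c)` treats `a` and `b` as four packed signed 8-bit integers each, multiplies them lane by lane, and adds the four products to `c`. A portable sketch of what a software fallback has to compute is shown below. The names `lane` and `dp4a_fallback` are hypothetical, and this is not llama.cpp's actual fallback code.

```cpp
#include <cstdint>

// Hypothetical helper: extract byte `i` (0..3) of a packed 32-bit word and
// interpret it as a signed 8-bit value.
static inline int lane(int32_t packed, int i) {
    return static_cast<int8_t>((static_cast<uint32_t>(packed) >> (8 * i)) & 0xFF);
}

// Scalar equivalent of a 4-way int8 dot product with accumulate:
// c + a0*b0 + a1*b1 + a2*b2 + a3*b3.
int dp4a_fallback(int32_t a, int32_t b, int32_t c) {
    int sum = c;
    for (int i = 0; i < 4; ++i) {
        sum += lane(a, i) * lane(b, i);
    }
    return sum;
}
```

On hardware with a matching instruction the whole operation is a single opcode. Without one, it expands into several extract, multiply, and add steps unless the compiler finds a better instruction sequence.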

Can not offload layers to GPU (llama3)

Could this be from newlines in your shell? You might be running `./main -m /models/Meta-Llama-3-70B-Instruct.Q4_K_M.gguf -r ''` and then separately trying to run `--in-prefix "\nuser\n\n"` and so on.

Compilation error using HIP SDK on Windows

`__shfl_xor()` for half2 was added in ROCm 5.6. You could install the newer HIP SDK version 5.7 and use that instead, or try this PR: #7263

Windows ROCm Build.

You can set them as environment variables before running cmake, or you can pass them as arguments.

```cmd
cmake -B build -G "Ninja" -DCMAKE_C_COMPILER=clang.exe -DCMAKE_CXX_COMPILER=clang++.exe -DLLAMA_HIPBLAS=ON -DCMAKE_BUILD_TYPE=Release
```

If the...

Windows ROCm Build.

After trying it: even if you build llama.cpp on Windows without the HIP SDK bin folder in your path (`C:\Program Files\AMD\ROCm\5.5\bin\`), the resulting executables won't run because they can't find...