llama-node
[ASK] enable cuda with manual compilation
Hi,
According to the llama.cpp GitHub repo, it's now possible to use CUDA on NVIDIA GPUs via the cuBLAS build.
So you may see where I'm going with this: how do I do a manual compilation, using make or cmake args, to enable LLAMA_CUBLAS?
:)
Greetings
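For reference, the upstream llama.cpp README describes the cuBLAS build like this at the time of writing; it assumes the NVIDIA CUDA toolkit is installed, and the flag names should be double-checked against the current repo:

# make-based build, from the llama.cpp checkout
make clean
make LLAMA_CUBLAS=1

# or the cmake-based build
mkdir build && cd build
cmake .. -DLLAMA_CUBLAS=ON
cmake --build . --config Release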
Hi, I plan to support this CUDA feature in the near future. Thanks for your suggestion.
This issue is pending the new llama sampling logic here: https://github.com/Atome-FE/llama-node/issues/36
Currently this issue is also blocked by a static linking problem in llama.cpp's CMake: https://github.com/ggerganov/llama.cpp/pull/1128#issuecomment-1531661524
I can compile with this:
let command = command
.arg("..")
.arg("-DCMAKE_BUILD_TYPE=Release")
.arg("-DLLAMA_OPENBLAS=ON")
.arg("-DLLAMA_CUBLAS=ON")
.arg("-DLLAMA_SHARED_LIBS=ON")
.arg("-DLLAMA_STATIC=ON")
.arg("-DLLAMA_ALL_WARNINGS=OFF")
.arg("-DLLAMA_ALL_WARNINGS_3RD_PARTY=OFF")
.arg("-DLLAMA_BUILD_TESTS=OFF")
.arg("-DLLAMA_BUILD_EXAMPLES=OFF")
.arg("-DCMAKE_POSITION_INDEPENDENT_CODE=ON");
But as a result, OpenBLAS is not activated, and cuBLAS doesn't seem to be activated either; it doesn't use my GPU.
Worse, I've also compiled llama.cpp itself from the GitHub source: BLAS works fine, but CLBlast and cuBLAS don't. It puts some data in VRAM, but it still only uses the CPU, not the GPU.
So I think we should wait a little bit for an update from llama.cpp.
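Note that two of those flags contradict each other, which may be why neither BLAS backend activates: -DLLAMA_STATIC=ON requests static linking while -DLLAMA_SHARED_LIBS=ON requests shared libraries (and llama.cpp's CMake spells the latter BUILD_SHARED_LIBS anyway, so LLAMA_SHARED_LIBS is likely ignored). A minimal sketch of a dynamic-linking flag set, assuming llama.cpp's CMakeLists of that era — treat the flag names as assumptions to verify:

let command = command
    .arg("..")
    .arg("-DCMAKE_BUILD_TYPE=Release")
    .arg("-DLLAMA_CUBLAS=ON")
    // build a shared libllama that the Node addon can link against dynamically
    .arg("-DBUILD_SHARED_LIBS=ON")
    // required for code that ends up inside a shared object
    .arg("-DCMAKE_POSITION_INDEPENDENT_CODE=ON")
    // note: no -DLLAMA_STATIC=ON here; it conflicts with shared libs
    .arg("-DLLAMA_BUILD_TESTS=OFF")
    .arg("-DLLAMA_BUILD_EXAMPLES=OFF");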
@tchereau great work. I was struggling with an -fPIC error for several hours and only solved it by dynamic linking. Thanks for your investigation. But it's true that the cuBLAS flag doesn't accelerate the evaluation properly. We'll have to wait a while.
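For context on that -fPIC error: a .node addon is itself a shared object, and on x86_64 every object file linked into a shared object must be compiled as position-independent code, so a static libllama.a built without -fPIC fails at the final link. A minimal reproduction sketch with made-up file names:

# static library built WITHOUT -fPIC
gcc -c foo.c -o foo.o
ar rcs libfoo.a foo.o
# linking it into a shared object typically fails with
# "relocation ... can not be used when making a shared object; recompile with -fPIC"
gcc -shared main.c libfoo.a -o addon.so
# fix: compile with -fPIC, which is what
# -DCMAKE_POSITION_INDEPENDENT_CODE=ON requests on the CMake side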
Hi, I think you can compile with this, but can you actually run it? I got an error related to cudaLaunchKernel, so I think something is abnormal with static linking. As a consequence, I'm going to provide a way to do dynamic linking, but I won't offer it as the default prebuilt binary.
Well, yes, I can compile, but I can't run it:
pnpm build:llama-cpp
> [email protected] build:llama-cpp /root/git/llama-node
> pnpm run --filter=@llama-node/llama-cpp cross-compile
> @llama-node/[email protected] cross-compile /root/git/llama-node/packages/llama-cpp
> rimraf @llama-node && tsx scripts/cross-compile.mts
info: component 'rust-std' for target 'x86_64-unknown-linux-gnu' is up to date
info: component 'rust-std' for target 'x86_64-unknown-linux-musl' is up to date
warning: /root/git/llama-node/Cargo.toml: unused manifest key: workspace.package.name
/bin/sh: 1: zig: not found
warning: /root/git/llama-node/Cargo.toml: unused manifest key: workspace.package.name
Blocking waiting for file lock on build directory
Compiling llama-sys v0.0.1 (/root/git/llama-node/packages/llama-cpp/llama-sys)
Compiling llama-node-cpp v0.1.0 (/root/git/llama-node/packages/llama-cpp)
warning: value assigned to `id` is never read
--> packages/llama-cpp/src/context.rs:189:17
|
189 | let mut id = 0;
| ^^
|
= help: maybe it is overwritten before being read?
= note: `#[warn(unused_assignments)]` on by default
warning: `llama-node-cpp` (lib) generated 1 warning
Finished release [optimized] target(s) in 55.77s
Compiling llama-sys v0.0.1 (/root/git/llama-node/packages/llama-cpp/llama-sys)
Compiling llama-node-cpp v0.1.0 (/root/git/llama-node/packages/llama-cpp)
warning: value assigned to `id` is never read
--> packages/llama-cpp/src/context.rs:189:17
|
189 | let mut id = 0;
| ^^
|
= help: maybe it is overwritten before being read?
= note: `#[warn(unused_assignments)]` on by default
warning: `llama-node-cpp` (lib) generated 1 warning
Finished release [optimized] target(s) in 1m 34s
nodejs:
node .
node:internal/modules/cjs/loader:1338
return process.dlopen(module, path.toNamespacedPath(filename));
^
Error: /root/git/llama-selfbot/node_modules/llama-node/node_modules/@llama-node/llama-cpp/@llama-node/llama-cpp.linux-x64-gnu.node: undefined symbol: cublasSetMathMode
at Module._extensions..node (node:internal/modules/cjs/loader:1338:18)
at Module.load (node:internal/modules/cjs/loader:1117:32)
at Module._load (node:internal/modules/cjs/loader:958:12)
at Module.require (node:internal/modules/cjs/loader:1141:19)
at require (node:internal/modules/cjs/helpers:110:18)
at Object.<anonymous> (/root/git/llama-selfbot/node_modules/llama-node/node_modules/@llama-node/llama-cpp/index.js:188:31)
at Module._compile (node:internal/modules/cjs/loader:1254:14)
at Module._extensions..js (node:internal/modules/cjs/loader:1308:10)
at Module.load (node:internal/modules/cjs/loader:1117:32)
at Module._load (node:internal/modules/cjs/loader:958:12) {
code: 'ERR_DLOPEN_FAILED'
}
Node.js v18.16.0
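That undefined symbol: cublasSetMathMode at dlopen time means the addon references cuBLAS but was never actually linked against libcublas. You can confirm this with standard binutils; the path below is shortened from the log above, so adjust it to your layout:

# the addon's recorded shared-library dependencies; if CUDA linking
# worked, some libcublas entry should appear here
ldd @llama-node/llama-cpp.linux-x64-gnu.node | grep -i cublas

# undefined dynamic symbols; a "U cublasSetMathMode" line means the
# symbol is required but no linked library provides it
nm -D @llama-node/llama-cpp.linux-x64-gnu.node | grep cublasSetMathMode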
And with CLBlast:
pnpm build:llama-cpp
> [email protected] build:llama-cpp /root/git/llama-node
> pnpm run --filter=@llama-node/llama-cpp cross-compile
> @llama-node/[email protected] cross-compile /root/git/llama-node/packages/llama-cpp
> rimraf @llama-node && tsx scripts/cross-compile.mts
info: component 'rust-std' for target 'x86_64-unknown-linux-gnu' is up to date
info: component 'rust-std' for target 'x86_64-unknown-linux-musl' is up to date
/bin/sh: 1: zig: not found
warning: /root/git/llama-node/Cargo.toml: unused manifest key: workspace.package.name
warning: /root/git/llama-node/Cargo.toml: unused manifest key: workspace.package.name
Blocking waiting for file lock on package cache
Blocking waiting for file lock on package cache
Blocking waiting for file lock on build directory
Compiling llama-sys v0.0.1 (/root/git/llama-node/packages/llama-cpp/llama-sys)
Compiling llama-node-cpp v0.1.0 (/root/git/llama-node/packages/llama-cpp)
warning: value assigned to `id` is never read
--> packages/llama-cpp/src/context.rs:189:17
|
189 | let mut id = 0;
| ^^
|
= help: maybe it is overwritten before being read?
= note: `#[warn(unused_assignments)]` on by default
warning: `llama-node-cpp` (lib) generated 1 warning
Finished release [optimized] target(s) in 49.25s
Compiling llama-sys v0.0.1 (/root/git/llama-node/packages/llama-cpp/llama-sys)
Compiling llama-node-cpp v0.1.0 (/root/git/llama-node/packages/llama-cpp)
warning: value assigned to `id` is never read
--> packages/llama-cpp/src/context.rs:189:17
|
189 | let mut id = 0;
| ^^
|
= help: maybe it is overwritten before being read?
= note: `#[warn(unused_assignments)]` on by default
warning: `llama-node-cpp` (lib) generated 1 warning
Finished release [optimized] target(s) in 1m 20s
nodejs:
node .
node:internal/modules/cjs/loader:1338
return process.dlopen(module, path.toNamespacedPath(filename));
^
Error: /root/git/llama-selfbot/node_modules/llama-node/node_modules/@llama-node/llama-cpp/@llama-node/llama-cpp.linux-x64-gnu.node: undefined symbol: clBuildProgram
at Module._extensions..node (node:internal/modules/cjs/loader:1338:18)
at Module.load (node:internal/modules/cjs/loader:1117:32)
at Module._load (node:internal/modules/cjs/loader:958:12)
at Module.require (node:internal/modules/cjs/loader:1141:19)
at require (node:internal/modules/cjs/helpers:110:18)
at Object.<anonymous> (/root/git/llama-selfbot/node_modules/llama-node/node_modules/@llama-node/llama-cpp/index.js:188:31)
at Module._compile (node:internal/modules/cjs/loader:1254:14)
at Module._extensions..js (node:internal/modules/cjs/loader:1308:10)
at Module.load (node:internal/modules/cjs/loader:1117:32)
at Module._load (node:internal/modules/cjs/loader:958:12) {
code: 'ERR_DLOPEN_FAILED'
}
Node.js v18.16.0
Same with .arg("-DLLAMA_OPENBLAS=ON") only:
undefined symbol: cblas_sgemm
In all cases, the problem seems to be libraries that are not linked.
At this point I can't help further, because I don't have the knowledge in C/Rust.
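That diagnosis fits all three symbols: cublasSetMathMode, clBuildProgram, and cblas_sgemm live in libcublas, libOpenCL, and libopenblas respectively, and each error means the final link never pulled the library in. In a Rust crate, the usual fix is for the build script to emit link directives; a minimal sketch, assuming it goes in llama-sys's build.rs and that the search path matches your CUDA install (both are assumptions, not llama-node's actual build script):

// build.rs (sketch): tell rustc where the libraries live and to link
// them dynamically into the final cdylib
fn main() {
    // search path for the CUDA toolkit; adjust to your install
    println!("cargo:rustc-link-search=native=/usr/local/cuda/lib64");
    // resolves cublasSetMathMode and friends at load time
    println!("cargo:rustc-link-lib=dylib=cublas");
    println!("cargo:rustc-link-lib=dylib=cudart");
    // for the CLBlast build instead: resolves clBuildProgram
    // println!("cargo:rustc-link-lib=dylib=OpenCL");
    // for the OpenBLAS build: resolves cblas_sgemm
    // println!("cargo:rustc-link-lib=dylib=openblas");
}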
Oops, I will reopen and leave this until static linking is ready someday... currently we only provide a self-built dynamic linking version.
Here are the customizable build features for cargo: https://github.com/Atome-FE/llama-node/pull/42/files. I will prepare separate docs for this.
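Cargo features of that kind are typically switched on at build time like this; the feature name cuda below is only an illustration, so check the PR for the real names:

# enable an optional cargo feature when building from source
# ("cuda" is a hypothetical feature name; see the PR for actual ones)
cargo build --release --features cuda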
A manual compilation guide has been provided here: https://llama-node.vercel.app/docs/cuda