cl-cuda
Any way to run on Windows?
Setting up cl-cuda seems to hook into gcc to create the FFI. GCC is well and good thanks to MSYS2/MinGW64, but apparently the CUDA toolkit and MinGW don't play nice together. Is there any way to set up cl-cuda to use the Windows CUDA toolchain?
I have not tried cl-cuda on Windows, but I suppose that if the following points are satisfied, cl-cuda would run on Windows natively, even without help from MSYS/MinGW. How about these?
- Running NVCC on the command line.
- Running external commands from Common Lisp.
- Calling libcuda.dll via CFFI.
nvcc works. I haven't tried it with actual input files, but it is on the PATH. Will try to compile samples and see what happens.
nvcc can be run through SBCL: (sb-ext:run-program "nvcc" nil :search t).
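A portable variant of the same check, as a minimal sketch: it goes through UIOP instead of sb-ext so it is not tied to SBCL. The function name nvcc-version is made up for illustration and is not part of cl-cuda; it only assumes that nvcc accepts --version.

```lisp
(require :asdf) ; ASDF bundles UIOP, which provides RUN-PROGRAM

;; Hypothetical helper, not part of cl-cuda.
(defun nvcc-version (&optional (binary "nvcc"))
  "Return the output of `BINARY --version` as a string, or NIL if it cannot be run."
  (handler-case
      (multiple-value-bind (output error-output exit-code)
          (uiop:run-program (list binary "--version")
                            :output :string
                            :error-output nil
                            :ignore-error-status t)
        (declare (ignore error-output))
        ;; Only trust the output when the process exited cleanly.
        (and (eql exit-code 0) output))
    ;; Some implementations signal an error when the binary is not found.
    (error () nil)))
```

Calling (nvcc-version) should return the version banner when nvcc is on the PATH and NIL otherwise.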
Can't find libcuda.dll on my system, even though I have a CUDA card and have installed the developer SDK. Is that a secondary dependency? I'll do some more research momentarily.
I can find cuda.lib but no cuda.dll.
CuPy https://github.com/pfnet/chainer/tree/master/cupy does almost the same thing as cl-cuda, but in Python: it generates CUDA C code, compiles it with NVCC, and launches kernels. It works on Windows as well, so this should be possible.
My installation was slightly borked due to the lack of a valid Visual Studio version. That problem is fixed and my environment is actually working now, but I still can't find the right DLL(s). Haven't taken a look at exactly what cupy does yet. Here are the DLLs I can find.
cublas64_75.dll
cudart32_75.dll
cudart64_75.dll
cufft64_75.dll
cufftw64_75.dll
cuinj32_75.dll
cuinj64_75.dll
curand64_75.dll
cusolver64_75.dll
cusparse64_75.dll
nppc64_75.dll
nppi64_75.dll
npps64_75.dll
nvblas64_75.dll
nvrtc64_75.dll
nvrtc-builtins64_75.dll
The FAQ at https://developer.nvidia.com/cuda-faq says that the library needed to use the driver API is "nvcuda.dll", and that it is included as part of the standard NVIDIA driver install. Can you find it in a Windows system folder such as System32? cl-cuda uses the driver API only.
Appears to work on SBCL for me.
* (ql:quickload :cffi)
To load "cffi":
Load 1 ASDF system:
cffi
; Loading "cffi"
........
(:CFFI)
* (cffi:load-foreign-library "nvcuda")
#<CFFI:FOREIGN-LIBRARY NVCUDA-523 "nvcuda">
Okay, then you should be able to load cl-cuda with nvcuda.dll.
(ql:quickload :cl-cuda)
Please set *nvcc-binary* to the path of the NVCC compiler and try to run some sample programs.
(setf cl-cuda:*nvcc-binary* #P"path/to/nvcc")
(ql:quickload :cl-cuda-examples)
(cl-cuda-examples.vector-add:main)
You may need to pass some options to nvcc via *nvcc-options*. Please let me know what you get.
Can't even load the system in the first place, thanks to an error groveling a file in cl-cuda. Here is the full stack trace from SLIME, since I'm very unfamiliar with native integration in SBCL.
Couldn't execute "gcc": The system cannot find the file specified.
[Condition of type CFFI-GROVEL:GROVEL-ERROR]
Restarts:
0: [RETRY] Retry PROCESS-OP on #<CUDA-GROVEL-FILE "cl-cuda" "src" "driver-api" "type-grovel">.
1: [ACCEPT] Continue, treating PROCESS-OP on #<CUDA-GROVEL-FILE "cl-cuda" "src" "driver-api" "type-grovel"> as having been successful.
2: [RETRY] Retry ASDF operation.
3: [CLEAR-CONFIGURATION-AND-RETRY] Retry ASDF operation after resetting the configuration.
4: [ABORT] Give up on "cl-cuda"
5: [RETRY] Retry SLIME REPL evaluation request.
--more--
Backtrace:
0: (CFFI-GROVEL:GROVEL-ERROR "~a" #<SIMPLE-ERROR "Couldn't execute ~S: ~A" {1006781B33}>)
1: ((FLET #:THUNK :IN CFFI-GROVEL:PROCESS-GROVEL-FILE))
2: (SB-IMPL::%WITH-STANDARD-IO-SYNTAX #<CLOSURE (FLET #:THUNK :IN CFFI-GROVEL:PROCESS-GROVEL-FILE) {9F2DDBB}>)
3: (CFFI-GROVEL:PROCESS-GROVEL-FILE #P"C:/Users/jaccarmac/software/quicklisp/local-projects/cl-cuda/src/driver-api/type-grovel.lisp" #P"C:/Users/jaccarmac/AppData/Local/cache/common-lisp/sbcl-1.3.6-win-x..
4: ((:METHOD ASDF/ACTION:PERFORM (CFFI-GROVEL::PROCESS-OP CFFI-GROVEL:GROVEL-FILE)) #<CFFI-GROVEL::PROCESS-OP > #<CL-CUDA-ASD::CUDA-GROVEL-FILE "cl-cuda" "src" "driver-api" "type-grovel">) [fast-method]
5: ((SB-PCL::EMF ASDF/ACTION:PERFORM) #<unavailable argument> #<unavailable argument> #<CFFI-GROVEL::PROCESS-OP > #<CL-CUDA-ASD::CUDA-GROVEL-FILE "cl-cuda" "src" "driver-api" "type-grovel">)
6: ((:METHOD ASDF/ACTION:PERFORM-WITH-RESTARTS :AROUND (T T)) #<CFFI-GROVEL::PROCESS-OP > #<CL-CUDA-ASD::CUDA-GROVEL-FILE "cl-cuda" "src" "driver-api" "type-grovel">) [fast-method]
7: ((:METHOD ASDF/PLAN:PERFORM-PLAN (LIST)) ((#1=#<ASDF/LISP-ACTION:PREPARE-OP > . #2=#<ASDF/SYSTEM:SYSTEM "uiop">) (#<ASDF/LISP-ACTION:COMPILE-OP > . #2#) (#3=#<ASDF/LISP-ACTION:LOAD-OP > . #2#) (#1# . ..
8: ((FLET SB-C::WITH-IT :IN SB-C::%WITH-COMPILATION-UNIT))
9: ((:METHOD ASDF/PLAN:PERFORM-PLAN :AROUND (T)) ((#1=#<ASDF/LISP-ACTION:PREPARE-OP > . #2=#<ASDF/SYSTEM:SYSTEM "uiop">) (#<ASDF/LISP-ACTION:COMPILE-OP > . #2#) (#3=#<ASDF/LISP-ACTION:LOAD-OP > . #2#) (#..
10: ((FLET SB-C::WITH-IT :IN SB-C::%WITH-COMPILATION-UNIT))
11: ((:METHOD ASDF/PLAN:PERFORM-PLAN :AROUND (T)) #<ASDF/PLAN:SEQUENTIAL-PLAN {1003E29C63}> :VERBOSE NIL) [fast-method]
12: ((:METHOD ASDF/OPERATE:OPERATE (ASDF/OPERATION:OPERATION ASDF/COMPONENT:COMPONENT)) #<ASDF/LISP-ACTION:LOAD-OP :VERBOSE NIL> #<ASDF/SYSTEM:SYSTEM "cl-cuda"> :VERBOSE NIL) [fast-method]
13: ((SB-PCL::EMF ASDF/OPERATE:OPERATE) #<unused argument> #<unused argument> #<ASDF/LISP-ACTION:LOAD-OP :VERBOSE NIL> #<ASDF/SYSTEM:SYSTEM "cl-cuda"> :VERBOSE NIL)
14: ((LAMBDA NIL :IN ASDF/OPERATE:OPERATE))
15: ((:METHOD ASDF/OPERATE:OPERATE :AROUND (T T)) #<ASDF/LISP-ACTION:LOAD-OP :VERBOSE NIL> #<ASDF/SYSTEM:SYSTEM "cl-cuda"> :VERBOSE NIL) [fast-method]
16: ((SB-PCL::EMF ASDF/OPERATE:OPERATE) #<unused argument> #<unused argument> ASDF/LISP-ACTION:LOAD-OP "cl-cuda" :VERBOSE NIL)
17: ((LAMBDA NIL :IN ASDF/OPERATE:OPERATE))
18: (ASDF/CACHE:CALL-WITH-ASDF-CACHE #<CLOSURE (LAMBDA NIL :IN ASDF/OPERATE:OPERATE) {1003E1B22B}> :OVERRIDE NIL :KEY NIL)
19: ((:METHOD ASDF/OPERATE:OPERATE :AROUND (T T)) ASDF/LISP-ACTION:LOAD-OP "cl-cuda" :VERBOSE NIL) [fast-method]
20: ((:METHOD ASDF/OPERATE:OPERATE :AROUND (T T)) ASDF/LISP-ACTION:LOAD-OP "cl-cuda" :VERBOSE NIL) [fast-method]
21: (ASDF/OPERATE:LOAD-SYSTEM "cl-cuda" :VERBOSE NIL)
22: (QUICKLISP-CLIENT::CALL-WITH-MACROEXPAND-PROGRESS #<CLOSURE (LAMBDA NIL :IN QUICKLISP-CLIENT::APPLY-LOAD-STRATEGY) {1003D8125B}>)
23: (QUICKLISP-CLIENT::AUTOLOAD-SYSTEM-AND-DEPENDENCIES "cl-cuda" :PROMPT NIL)
24: ((:METHOD QL-IMPL-UTIL::%CALL-WITH-QUIET-COMPILATION (T T)) #<unavailable argument> #<CLOSURE (FLET QUICKLISP-CLIENT::QL :IN QUICKLISP-CLIENT:QUICKLOAD) {1004559C2B}>) [fast-method]
25: ((:METHOD QL-IMPL-UTIL::%CALL-WITH-QUIET-COMPILATION :AROUND (QL-IMPL:SBCL T)) #<QL-IMPL:SBCL {10066F0833}> #<CLOSURE (FLET QUICKLISP-CLIENT::QL :IN QUICKLISP-CLIENT:QUICKLOAD) {1004559C2B}>) [fast-me..
26: ((:METHOD QUICKLISP-CLIENT:QUICKLOAD (T)) #<unavailable argument> :PROMPT NIL :SILENT NIL :VERBOSE NIL) [fast-method]
27: (QL-DIST::CALL-WITH-CONSISTENT-DISTS #<CLOSURE (LAMBDA NIL :IN QUICKLISP-CLIENT:QUICKLOAD) {100453EAFB}>)
28: (SB-INT:SIMPLE-EVAL-IN-LEXENV (QUICKLISP-CLIENT:QUICKLOAD :CL-CUDA) #<NULL-LEXENV>)
29: (EVAL (QUICKLISP-CLIENT:QUICKLOAD :CL-CUDA))
30: (SWANK::EVAL-REGION "(ql:quickload :cl-cuda) ..)
31: ((LAMBDA NIL :IN SWANK-REPL::REPL-EVAL))
32: (SWANK-REPL::TRACK-PACKAGE #<CLOSURE (LAMBDA NIL :IN SWANK-REPL::REPL-EVAL) {100453E25B}>)
33: (SWANK::CALL-WITH-RETRY-RESTART "Retry SLIME REPL evaluation request." #<CLOSURE (LAMBDA NIL :IN SWANK-REPL::REPL-EVAL) {100453E1BB}>)
34: (SWANK::CALL-WITH-BUFFER-SYNTAX NIL #<CLOSURE (LAMBDA NIL :IN SWANK-REPL::REPL-EVAL) {100453E19B}>)
35: (SWANK-REPL::REPL-EVAL "(ql:quickload :cl-cuda) ..)
36: (SB-INT:SIMPLE-EVAL-IN-LEXENV (SWANK-REPL:LISTENER-EVAL "(ql:quickload :cl-cuda) ..)
37: (EVAL (SWANK-REPL:LISTENER-EVAL "(ql:quickload :cl-cuda) ..)
38: (SWANK:EVAL-FOR-EMACS (SWANK-REPL:LISTENER-EVAL "(ql:quickload :cl-cuda) ..)
39: (SWANK::PROCESS-REQUESTS NIL)
40: ((LAMBDA NIL :IN SWANK::HANDLE-REQUESTS))
41: ((LAMBDA NIL :IN SWANK::HANDLE-REQUESTS))
42: (SWANK/SBCL::CALL-WITH-BREAK-HOOK #<FUNCTION SWANK:SWANK-DEBUGGER-HOOK> #<CLOSURE (LAMBDA NIL :IN SWANK::HANDLE-REQUESTS) {1003DD000B}>)
43: ((FLET SWANK/BACKEND:CALL-WITH-DEBUGGER-HOOK :IN "c:/Users/jaccarmac/.emacs.d/elpa/slime-20160614.1214/swank/sbcl.lisp") #<FUNCTION SWANK:SWANK-DEBUGGER-HOOK> #<CLOSURE (LAMBDA NIL :IN SWANK::HANDLE-R..
44: (SWANK::CALL-WITH-BINDINGS ((*STANDARD-INPUT* . #1=#<SWANK/GRAY::SLIME-INPUT-STREAM {1003C7EB13}>) (*STANDARD-OUTPUT* . #2=#<SWANK/GRAY::SLIME-OUTPUT-STREAM {1003D8F743}>) (*TRACE-OUTPUT* . #2#) (*ERR..
45: (SWANK::HANDLE-REQUESTS #<SWANK::MULTITHREADED-CONNECTION {1003220523}> NIL)
46: ((FLET #:WITHOUT-INTERRUPTS-BODY-1161 :IN SB-THREAD::INITIAL-THREAD-FUNCTION-TRAMPOLINE))
47: ((FLET SB-THREAD::WITH-MUTEX-THUNK :IN SB-THREAD::INITIAL-THREAD-FUNCTION-TRAMPOLINE))
48: ((FLET #:WITHOUT-INTERRUPTS-BODY-359 :IN SB-THREAD::CALL-WITH-MUTEX))
49: (SB-THREAD::CALL-WITH-MUTEX #<CLOSURE (FLET SB-THREAD::WITH-MUTEX-THUNK :IN SB-THREAD::INITIAL-THREAD-FUNCTION-TRAMPOLINE) {9F2FB5B}> #<SB-THREAD:MUTEX "thread result lock" owner: #<SB-THREAD:THREAD "..
50: (SB-THREAD::INITIAL-THREAD-FUNCTION-TRAMPOLINE #<SB-THREAD:THREAD "repl-thread" RUNNING {1003DC8033}> NIL #<CLOSURE (LAMBDA NIL :IN SWANK-REPL::SPAWN-REPL-THREAD) {1003DBFF9B}> (#<SB-THREAD:THREAD "re..
51: ("foreign function: #x42E6FC")
52: ("foreign function: #x40334E")
53: ("foreign function: #x8B6FE0")
Ah... grovel... I missed what you mentioned at first along with nvcc. While I think of a workaround, how did it fail on MSYS2/MinGW64 in the first place?
but apparently the CUDA toolkit and MinGW don't play nice together.
AFAICT (definitely not an expert systems programmer :-), NVIDIA distributes their dev environment as binaries, but provides .libs for MSVC instead of DLLs, which means you have to do low-level lib twiddling to get them to link against MinGW's libc.
Is it possible to call nvcuda.dll from SBCL on MinGW?
- Running NVCC on the command line.
- Running external commands from Common Lisp.
- Calling libcuda.dll via CFFI.
- Groveling cuda.h with gcc.
I suppose that MinGW can call DLLs as well as GNU libraries, though I am not familiar with its calling convention.
MinGW does use DLLs as its shared library format, but as I understand it they are linked to an old msvcr.dll. In any case, here are the results from running SBCL from inside a MinGW64 shell.
Subprocess (:PROCESS #<SB-IMPL::PROCESS :EXITED 1>)
with command ("gcc" "-m64" "-o"
"C:\\Users\\jaccarmac\\AppData\\Local\\cache\\common-lisp\\sbcl-1.3.6-win-x64\\C\\Users\\jaccarmac\\software\\quicklisp\\local-projects\\cl-cuda\\src\\driver-api\\type-grovel__grovel-tmpGHU3ALSV.exe"
"-IC:/Users/jaccarmac/software/quicklisp/dists/quicklisp/software/cffi_0.17.1/"
"C:\\Users\\jaccarmac\\AppData\\Local\\cache\\common-lisp\\sbcl-1.3.6-win-x64\\C\\Users\\jaccarmac\\software\\quicklisp\\local-projects\\cl-cuda\\src\\driver-api\\type-grovel__grovel.c")
exited with error code 1
[Condition of type CFFI-GROVEL:GROVEL-ERROR]
Restarts:
0: [RETRY] Retry PROCESS-OP on #<CUDA-GROVEL-FILE "cl-cuda" "src" "driver-api" "type-grovel">.
1: [ACCEPT] Continue, treating PROCESS-OP on #<CUDA-GROVEL-FILE "cl-cuda" "src" "driver-api" "type-grovel"> as having been successful.
2: [RETRY] Retry ASDF operation.
3: [CLEAR-CONFIGURATION-AND-RETRY] Retry ASDF operation after resetting the configuration.
4: [ABORT] Give up on "cl-cuda"
5: [RETRY] Retry SLIME REPL evaluation request.
--more--
Backtrace:
0: (CFFI-GROVEL:GROVEL-ERROR "~a" #<UIOP/RUN-PROGRAM:SUBPROCESS-ERROR {100614BC93}>)
1: ((FLET #:THUNK :IN CFFI-GROVEL:PROCESS-GROVEL-FILE))
2: (SB-IMPL::%WITH-STANDARD-IO-SYNTAX #<CLOSURE (FLET #:THUNK :IN CFFI-GROVEL:PROCESS-GROVEL-FILE) {9EEDDBB}>)
3: (CFFI-GROVEL:PROCESS-GROVEL-FILE #P"C:/Users/jaccarmac/software/quicklisp/local-projects/cl-cuda/src/driver-api/type-grovel.lisp" #P"C:/Users/jaccarmac/AppData/Local/cache/common-lisp/sbcl-1.3.6-win-x..
4: ((:METHOD ASDF/ACTION:PERFORM (CFFI-GROVEL::PROCESS-OP CFFI-GROVEL:GROVEL-FILE)) #<CFFI-GROVEL::PROCESS-OP > #<CL-CUDA-ASD::CUDA-GROVEL-FILE "cl-cuda" "src" "driver-api" "type-grovel">) [fast-method]
5: ((SB-PCL::EMF ASDF/ACTION:PERFORM) #<unavailable argument> #<unavailable argument> #<CFFI-GROVEL::PROCESS-OP > #<CL-CUDA-ASD::CUDA-GROVEL-FILE "cl-cuda" "src" "driver-api" "type-grovel">)
6: ((:METHOD ASDF/ACTION:PERFORM-WITH-RESTARTS :AROUND (T T)) #<CFFI-GROVEL::PROCESS-OP > #<CL-CUDA-ASD::CUDA-GROVEL-FILE "cl-cuda" "src" "driver-api" "type-grovel">) [fast-method]
7: ((:METHOD ASDF/PLAN:PERFORM-PLAN (LIST)) ((#1=#<ASDF/LISP-ACTION:PREPARE-OP > . #2=#<ASDF/SYSTEM:SYSTEM "uiop">) (#<ASDF/LISP-ACTION:COMPILE-OP > . #2#) (#3=#<ASDF/LISP-ACTION:LOAD-OP > . #2#) (#1# . ..
8: ((FLET SB-C::WITH-IT :IN SB-C::%WITH-COMPILATION-UNIT))
9: ((:METHOD ASDF/PLAN:PERFORM-PLAN :AROUND (T)) ((#1=#<ASDF/LISP-ACTION:PREPARE-OP > . #2=#<ASDF/SYSTEM:SYSTEM "uiop">) (#<ASDF/LISP-ACTION:COMPILE-OP > . #2#) (#3=#<ASDF/LISP-ACTION:LOAD-OP > . #2#) (#..
10: ((FLET SB-C::WITH-IT :IN SB-C::%WITH-COMPILATION-UNIT))
11: ((:METHOD ASDF/PLAN:PERFORM-PLAN :AROUND (T)) #<ASDF/PLAN:SEQUENTIAL-PLAN {1003781C63}> :VERBOSE NIL) [fast-method]
12: ((:METHOD ASDF/OPERATE:OPERATE (ASDF/OPERATION:OPERATION ASDF/COMPONENT:COMPONENT)) #<ASDF/LISP-ACTION:LOAD-OP :VERBOSE NIL> #<ASDF/SYSTEM:SYSTEM "cl-cuda"> :VERBOSE NIL) [fast-method]
13: ((SB-PCL::EMF ASDF/OPERATE:OPERATE) #<unused argument> #<unused argument> #<ASDF/LISP-ACTION:LOAD-OP :VERBOSE NIL> #<ASDF/SYSTEM:SYSTEM "cl-cuda"> :VERBOSE NIL)
14: ((LAMBDA NIL :IN ASDF/OPERATE:OPERATE))
15: ((:METHOD ASDF/OPERATE:OPERATE :AROUND (T T)) #<ASDF/LISP-ACTION:LOAD-OP :VERBOSE NIL> #<ASDF/SYSTEM:SYSTEM "cl-cuda"> :VERBOSE NIL) [fast-method]
16: ((SB-PCL::EMF ASDF/OPERATE:OPERATE) #<unused argument> #<unused argument> ASDF/LISP-ACTION:LOAD-OP "cl-cuda" :VERBOSE NIL)
17: ((LAMBDA NIL :IN ASDF/OPERATE:OPERATE))
18: (ASDF/CACHE:CALL-WITH-ASDF-CACHE #<CLOSURE (LAMBDA NIL :IN ASDF/OPERATE:OPERATE) {100377322B}> :OVERRIDE NIL :KEY NIL)
19: ((:METHOD ASDF/OPERATE:OPERATE :AROUND (T T)) ASDF/LISP-ACTION:LOAD-OP "cl-cuda" :VERBOSE NIL) [fast-method]
20: ((:METHOD ASDF/OPERATE:OPERATE :AROUND (T T)) ASDF/LISP-ACTION:LOAD-OP "cl-cuda" :VERBOSE NIL) [fast-method]
21: (ASDF/OPERATE:LOAD-SYSTEM "cl-cuda" :VERBOSE NIL)
22: (QUICKLISP-CLIENT::CALL-WITH-MACROEXPAND-PROGRESS #<CLOSURE (LAMBDA NIL :IN QUICKLISP-CLIENT::APPLY-LOAD-STRATEGY) {100371125B}>)
23: (QUICKLISP-CLIENT::AUTOLOAD-SYSTEM-AND-DEPENDENCIES "cl-cuda" :PROMPT NIL)
24: ((:METHOD QL-IMPL-UTIL::%CALL-WITH-QUIET-COMPILATION (T T)) #<unavailable argument> #<CLOSURE (FLET QUICKLISP-CLIENT::QL :IN QUICKLISP-CLIENT:QUICKLOAD) {1003DE55FB}>) [fast-method]
25: ((:METHOD QL-IMPL-UTIL::%CALL-WITH-QUIET-COMPILATION :AROUND (QL-IMPL:SBCL T)) #<QL-IMPL:SBCL {10066F0833}> #<CLOSURE (FLET QUICKLISP-CLIENT::QL :IN QUICKLISP-CLIENT:QUICKLOAD) {1003DE55FB}>) [fast-me..
26: ((:METHOD QUICKLISP-CLIENT:QUICKLOAD (T)) #<unavailable argument> :PROMPT NIL :SILENT NIL :VERBOSE NIL) [fast-method]
27: (QL-DIST::CALL-WITH-CONSISTENT-DISTS #<CLOSURE (LAMBDA NIL :IN QUICKLISP-CLIENT:QUICKLOAD) {1003DC330B}>)
28: (SB-INT:SIMPLE-EVAL-IN-LEXENV (QUICKLISP-CLIENT:QUICKLOAD :CL-CUDA) #<NULL-LEXENV>)
29: (EVAL (QUICKLISP-CLIENT:QUICKLOAD :CL-CUDA))
30: (SWANK::EVAL-REGION "(ql:quickload :cl-cuda) ..)
31: ((LAMBDA NIL :IN SWANK-REPL::REPL-EVAL))
32: (SWANK-REPL::TRACK-PACKAGE #<CLOSURE (LAMBDA NIL :IN SWANK-REPL::REPL-EVAL) {1003DC2A6B}>)
33: (SWANK::CALL-WITH-RETRY-RESTART "Retry SLIME REPL evaluation request." #<CLOSURE (LAMBDA NIL :IN SWANK-REPL::REPL-EVAL) {1003DC29CB}>)
34: (SWANK::CALL-WITH-BUFFER-SYNTAX NIL #<CLOSURE (LAMBDA NIL :IN SWANK-REPL::REPL-EVAL) {1003DC29AB}>)
35: (SWANK-REPL::REPL-EVAL "(ql:quickload :cl-cuda) ..)
36: (SB-INT:SIMPLE-EVAL-IN-LEXENV (SWANK-REPL:LISTENER-EVAL "(ql:quickload :cl-cuda) ..)
37: (EVAL (SWANK-REPL:LISTENER-EVAL "(ql:quickload :cl-cuda) ..)
38: (SWANK:EVAL-FOR-EMACS (SWANK-REPL:LISTENER-EVAL "(ql:quickload :cl-cuda) ..)
39: (SWANK::PROCESS-REQUESTS NIL)
40: ((LAMBDA NIL :IN SWANK::HANDLE-REQUESTS))
41: ((LAMBDA NIL :IN SWANK::HANDLE-REQUESTS))
42: (SWANK/SBCL::CALL-WITH-BREAK-HOOK #<FUNCTION SWANK:SWANK-DEBUGGER-HOOK> #<CLOSURE (LAMBDA NIL :IN SWANK::HANDLE-REQUESTS) {1003DC000B}>)
43: ((FLET SWANK/BACKEND:CALL-WITH-DEBUGGER-HOOK :IN "c:/Users/jaccarmac/.emacs.d/elpa/slime-20160614.1214/swank/sbcl.lisp") #<FUNCTION SWANK:SWANK-DEBUGGER-HOOK> #<CLOSURE (LAMBDA NIL :IN SWANK::HANDLE-R..
44: (SWANK::CALL-WITH-BINDINGS ((*STANDARD-INPUT* . #1=#<SWANK/GRAY::SLIME-INPUT-STREAM {1003C76B13}>) (*STANDARD-OUTPUT* . #2=#<SWANK/GRAY::SLIME-OUTPUT-STREAM {1003D87DF3}>) (*TRACE-OUTPUT* . #2#) (*ERR..
45: (SWANK::HANDLE-REQUESTS #<SWANK::MULTITHREADED-CONNECTION {1003220523}> NIL)
46: ((FLET #:WITHOUT-INTERRUPTS-BODY-1161 :IN SB-THREAD::INITIAL-THREAD-FUNCTION-TRAMPOLINE))
47: ((FLET SB-THREAD::WITH-MUTEX-THUNK :IN SB-THREAD::INITIAL-THREAD-FUNCTION-TRAMPOLINE))
48: ((FLET #:WITHOUT-INTERRUPTS-BODY-359 :IN SB-THREAD::CALL-WITH-MUTEX))
49: (SB-THREAD::CALL-WITH-MUTEX #<CLOSURE (FLET SB-THREAD::WITH-MUTEX-THUNK :IN SB-THREAD::INITIAL-THREAD-FUNCTION-TRAMPOLINE) {9EEFB5B}> #<SB-THREAD:MUTEX "thread result lock" owner: #<SB-THREAD:THREAD "..
50: (SB-THREAD::INITIAL-THREAD-FUNCTION-TRAMPOLINE #<SB-THREAD:THREAD "repl-thread" RUNNING {1003DB8033}> NIL #<CLOSURE (LAMBDA NIL :IN SWANK-REPL::SPAWN-REPL-THREAD) {1003DB7F9B}> (#<SB-THREAD:THREAD "re..
51: ("foreign function: #x42E6FC")
52: ("foreign function: #x40334E")
53: ("foreign function: #x2637DA0")
What does gcc print if you execute it directly? You should be able to find some error messages.
gcc -m64 -o C:\Users\jaccarmac\AppData\Local\cache\common-lisp\sbcl-1.3.6-win-x64\C\Users\jaccarmac\software\quicklisp\local-projects\cl-cuda\src\driver-api\type-grovel__grovel-tmpGHU3ALSV.exe -IC:/Users/jaccarmac/software/quicklisp/dists/quicklisp/software/cffi_0.17.1/ C:\Users\jaccarmac\AppData\Local\cache\common-lisp\sbcl-1.3.6-win-x64\C\Users\jaccarmac\software\quicklisp\local-projects\cl-cuda\src\driver-api\type-grovel__grovel.c
The command as written fails because C:\Users\jaccarmac\AppData\Local\cache\common-lisp\sbcl-1.3.6-win-x64\C\Users\jaccarmac\software\quicklisp\local-projects\cl-cuda\src\driver-api\type-grovel__grovel-tmpGHU3ALSV.exe is not a valid path. (Note the C\ in the middle of the pathname.)
Do you not have the path C:\Users\jaccarmac\AppData\Local\cache\common-lisp\sbcl-1.3.6-win-x64\C\ ? Or could it be because of escape sequences? How about this?
gcc -m64 -o C:\\Users\\jaccarmac\\AppData\\Local\\cache\\common-lisp\\sbcl-1.3.6-win-x64\\C\\Users\\jaccarmac\\software\\quicklisp\\local-projects\\cl-cuda\\src\\driver-api\\type-grovel__grovel-tmpGHU3ALSV.exe -IC:/Users/jaccarmac/software/quicklisp/dists/quicklisp/software/cffi_0.17.1/ C:\\Users\\jaccarmac\\AppData\\Local\\cache\\common-lisp\\sbcl-1.3.6-win-x64\\C\\Users\\jaccarmac\\software\\quicklisp\\local-projects\\cl-cuda\\src\\driver-api\\type-grovel__grovel.c
Aha, that was it. cuda.h is missing from gcc's search path.
Please add the path to cuda.h to the environment variable C_INCLUDE_PATH. Can you find cuda.h in your environment?
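To check from the REPL whether gcc will be able to see cuda.h, something like the following might help. This is only a sketch: find-header is a hypothetical helper, not part of cl-cuda or CFFI, and it only looks at the directories listed in the string you pass in, split on the platform's path-list separator.

```lisp
(require :asdf) ; ASDF bundles UIOP

;; Hypothetical helper, not part of cl-cuda or CFFI.
(defun find-header (header search-path)
  "Return the first existing pathname for HEADER in SEARCH-PATH, or NIL.
SEARCH-PATH is a string of directories joined by the platform separator
(\";\" on Windows, \":\" elsewhere), as in C_INCLUDE_PATH."
  (when search-path
    (loop for dir in (uiop:split-string
                      search-path
                      :separator (string (uiop:inter-directory-separator)))
          ;; Skip empty entries such as a trailing separator.
          for candidate = (and (plusp (length dir))
                               (merge-pathnames
                                header
                                (uiop:ensure-directory-pathname dir)))
          when (and candidate (uiop:file-exists-p candidate))
            return candidate)))
```

For example, (find-header "cuda.h" (uiop:getenv "C_INCLUDE_PATH")) returns the header's pathname if one of the listed directories contains it, and NIL if the variable is unset or the header is not there.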
That seems to work! Thanks!
I see you have a list of supported environments in the README. If you let me know how to run the test suite, I can verify it passes and submit a PR with my specifics.
Test programs and (asdf:oos 'asdf:test-op '#:cl-cuda) both fail with an "alien function cuInit is undefined" error.
That would help me a lot. You can run the tests with (ql:quickload :cl-cuda-test).
The alien function "cuInit" is undefined. is what I'm still getting, natively or from the MSYS console, with or without changes to PATH or C_INCLUDE_PATH.
cuInit should be defined via CFFI:DEFCFUN in cl-cuda/src/driver-api/function.lisp. There may be something still missing for calling the API in nvcuda.dll. Let me think for a while.
Would you try it again with the following fix in cl-cuda/src/driver-api/library.lisp?
(cffi:define-foreign-library libcuda
+ (:windows "nvcuda.dll")
+ ; (:windows "nvcuda.dll" :convention :stdcall)
(:darwin (:framework "CUDA"))
(:unix (:or "libcuda.so" "libcuda64.so")))
At least, the error The alien function "cuInit" is undefined. is caused by a missing Windows line in the foreign library definition, but I do not know whether :convention :stdcall is required.
Both versions seem to work, and further into the test suite we get The function OSICAT-POSIX:MKTEMP is undefined.
This test also fails further up the chain.
? basic case 4
"float3_add( __make_float3( 1.0f, 1.0f, 1.0f ), __make_float3( 2.0f, 2.0f, 2.0f ) )" is expected to be "float3_add( make_float3( 1.0f, 1.0f, 1.0f ), make_float3( 2.0f, 2.0f, 2.0f ) )"
Both versions seem to work, and further into the test suite we get The function OSICAT-POSIX:MKTEMP is undefined.
OSICAT-POSIX:MKTEMP might not work on Windows; please apply this patch as a workaround. I use MKTEMP just to make a temporary file name. I will fix it later.
cl-cuda/src/api/nvcc.lisp
(defun get-cu-path ()
+ (let ((name "cl-cuda.tmp"))
- (let ((name (format nil "cl-cuda.~A" (osicat-posix:mktemp))))
(make-pathname :name name :type "cu" :defaults (get-tmp-path))))
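The patch above swaps the unique name for a fixed one, so two processes compiling at the same time would share one file. If that matters, a unique name can be produced without osicat. The following is only a sketch of one option, not the fix that went into cl-cuda: unique-cu-pathname is a hypothetical name, and the directory argument stands in for whatever (get-tmp-path) returns. It is not atomic, so a collision is unlikely but still possible.

```lisp
;; Hypothetical replacement for the MKTEMP-based name; plain Common Lisp only.
(defun unique-cu-pathname (directory)
  "Return a pathname like DIRECTORY/cl-cuda.<random>.cu.
The random part is 8 base-36 digits at most; no file is created."
  (let* ((state (make-random-state t))               ; freshly seeded state
         (suffix (random (expt 36 8) state))
         (name (format nil "cl-cuda.~36R" suffix)))  ; ~36R prints in radix 36
    (make-pathname :name name :type "cu" :defaults directory)))
```

Inside get-cu-path this would be called as (unique-cu-pathname (get-tmp-path)).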