
Any way to run on Windows?

Open jaccarmac opened this issue 8 years ago • 52 comments

Setting up cl-cuda seems to hook into gcc to create the FFI. GCC is well and good thanks to MSYS2/MinGW64, but apparently the CUDA toolkit and MinGW don't play nice together. Is there any way to set up cl-cuda to use the Windows CUDA toolchain?

jaccarmac avatar Jun 23 '16 22:06 jaccarmac

I have not tried cl-cuda on Windows, but I suppose that if the following points are satisfied, cl-cuda would run on Windows natively, even without help from MSYS/MinGW. How about these?

  • Running NVCC on command line.
  • Running external commands from Common Lisp.
  • Calling libcuda.dll via CFFI.

takagi avatar Jun 23 '16 23:06 takagi

nvcc works, haven't tried it with actual input files but it is on the PATH. Will try to compile samples and see what happens.

nvcc can be run through SBCL (sb-ext:run-program "nvcc" nil :search t).
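
A portable variant of the same check can be sketched with UIOP instead of the SBCL-specific call. This is only an illustration: the function name nvcc-version is hypothetical, and it assumes ASDF/UIOP is available and that nvcc is on the PATH.

```lisp
;; Sketch: run `nvcc --version` from Lisp and capture its output.
;; NVCC-VERSION is a hypothetical helper name, not part of cl-cuda.
(require :asdf)

(defun nvcc-version (&optional (nvcc "nvcc"))
  "Return the output of `NVCC --version` as a string, or NIL if it cannot be run."
  (handler-case
      ;; A list command is executed directly, without going through a shell.
      (uiop:run-program (list nvcc "--version")
                        :output '(:string :stripped t))
    ;; Covers both a missing executable and a non-zero exit status.
    (error () nil)))
```

Calling (nvcc-version) at the REPL should return the version banner when the toolkit is installed correctly, and NIL otherwise.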

Can't find libcuda.dll on my system, even though I have a CUDA card and have installed the developer SDK. Is that a secondary dependency? I'll do some more research momentarily.

jaccarmac avatar Jun 24 '16 00:06 jaccarmac

I can find cuda.lib but no cuda.dll.

jaccarmac avatar Jun 24 '16 01:06 jaccarmac

CuPy (https://github.com/pfnet/chainer/tree/master/cupy) does almost the same thing as cl-cuda, but in Python: it generates CUDA C code, compiles it with NVCC, and launches kernels. It works on Windows as well, so this should be possible.

takagi avatar Jun 24 '16 04:06 takagi

My installation was slightly borked due to the lack of a valid Visual Studio version. That problem is fixed and my environment is actually working now, but I still can't find the right DLL(s). Haven't taken a look at exactly what cupy does yet. Here are the DLLs I can find.

cublas64_75.dll
cudart32_75.dll
cudart64_75.dll
cufft64_75.dll
cufftw64_75.dll
cuinj32_75.dll
cuinj64_75.dll
curand64_75.dll
cusolver64_75.dll
cusparse64_75.dll
nppc64_75.dll
nppi64_75.dll
npps64_75.dll
nvblas64_75.dll
nvrtc64_75.dll
nvrtc-builtins64_75.dll

jaccarmac avatar Jun 24 '16 17:06 jaccarmac

The CUDA FAQ (https://developer.nvidia.com/cuda-faq) says that the library needed to use the driver API is "nvcuda.dll", and that it is included as part of the standard NVIDIA driver install. Can you find it in a Windows system folder such as System32? cl-cuda uses the driver API only.

takagi avatar Jun 24 '16 17:06 takagi

Appears to work on SBCL for me.

* (ql:quickload :cffi)
To load "cffi":
  Load 1 ASDF system:
    cffi
; Loading "cffi"
........
(:CFFI)
* (cffi:load-foreign-library "nvcuda")

#<CFFI:FOREIGN-LIBRARY NVCUDA-523 "nvcuda">

jaccarmac avatar Jun 24 '16 18:06 jaccarmac

Okay, then you should be able to load cl-cuda with nvcuda.dll.

(ql:quickload :cl-cuda)

Please set *nvcc-binary* to the path of the NVCC compiler and try to run some sample programs.

(setf cl-cuda:*nvcc-binary* #P"path\\to\\nvcc")

(ql:quickload :cl-cuda-examples)
(cl-cuda-examples.vector-add:main)

You may need to pass some options to nvcc via *nvcc-options*. Please let me know what you get.

takagi avatar Jun 25 '16 05:06 takagi

Can't even load the system in the first place, thanks to an error groveling a file in cl-cuda. Here is the full stack trace from SLIME, since I'm very unfamiliar with native integration in SBCL.

Couldn't execute "gcc": The system cannot find the file specified.
   [Condition of type CFFI-GROVEL:GROVEL-ERROR]

Restarts:
 0: [RETRY] Retry PROCESS-OP on #<CUDA-GROVEL-FILE "cl-cuda" "src" "driver-api" "type-grovel">.
 1: [ACCEPT] Continue, treating PROCESS-OP on #<CUDA-GROVEL-FILE "cl-cuda" "src" "driver-api" "type-grovel"> as having been successful.
 2: [RETRY] Retry ASDF operation.
 3: [CLEAR-CONFIGURATION-AND-RETRY] Retry ASDF operation after resetting the configuration.
 4: [ABORT] Give up on "cl-cuda"
 5: [RETRY] Retry SLIME REPL evaluation request.
 --more--

Backtrace:
  0: (CFFI-GROVEL:GROVEL-ERROR "~a" #<SIMPLE-ERROR "Couldn't execute ~S: ~A" {1006781B33}>)
  1: ((FLET #:THUNK :IN CFFI-GROVEL:PROCESS-GROVEL-FILE))
  2: (SB-IMPL::%WITH-STANDARD-IO-SYNTAX #<CLOSURE (FLET #:THUNK :IN CFFI-GROVEL:PROCESS-GROVEL-FILE) {9F2DDBB}>)
  3: (CFFI-GROVEL:PROCESS-GROVEL-FILE #P"C:/Users/jaccarmac/software/quicklisp/local-projects/cl-cuda/src/driver-api/type-grovel.lisp" #P"C:/Users/jaccarmac/AppData/Local/cache/common-lisp/sbcl-1.3.6-win-x..
  4: ((:METHOD ASDF/ACTION:PERFORM (CFFI-GROVEL::PROCESS-OP CFFI-GROVEL:GROVEL-FILE)) #<CFFI-GROVEL::PROCESS-OP > #<CL-CUDA-ASD::CUDA-GROVEL-FILE "cl-cuda" "src" "driver-api" "type-grovel">) [fast-method]
  5: ((SB-PCL::EMF ASDF/ACTION:PERFORM) #<unavailable argument> #<unavailable argument> #<CFFI-GROVEL::PROCESS-OP > #<CL-CUDA-ASD::CUDA-GROVEL-FILE "cl-cuda" "src" "driver-api" "type-grovel">)
  6: ((:METHOD ASDF/ACTION:PERFORM-WITH-RESTARTS :AROUND (T T)) #<CFFI-GROVEL::PROCESS-OP > #<CL-CUDA-ASD::CUDA-GROVEL-FILE "cl-cuda" "src" "driver-api" "type-grovel">) [fast-method]
  7: ((:METHOD ASDF/PLAN:PERFORM-PLAN (LIST)) ((#1=#<ASDF/LISP-ACTION:PREPARE-OP > . #2=#<ASDF/SYSTEM:SYSTEM "uiop">) (#<ASDF/LISP-ACTION:COMPILE-OP > . #2#) (#3=#<ASDF/LISP-ACTION:LOAD-OP > . #2#) (#1# . ..
  8: ((FLET SB-C::WITH-IT :IN SB-C::%WITH-COMPILATION-UNIT))
  9: ((:METHOD ASDF/PLAN:PERFORM-PLAN :AROUND (T)) ((#1=#<ASDF/LISP-ACTION:PREPARE-OP > . #2=#<ASDF/SYSTEM:SYSTEM "uiop">) (#<ASDF/LISP-ACTION:COMPILE-OP > . #2#) (#3=#<ASDF/LISP-ACTION:LOAD-OP > . #2#) (#..
 10: ((FLET SB-C::WITH-IT :IN SB-C::%WITH-COMPILATION-UNIT))
 11: ((:METHOD ASDF/PLAN:PERFORM-PLAN :AROUND (T)) #<ASDF/PLAN:SEQUENTIAL-PLAN {1003E29C63}> :VERBOSE NIL) [fast-method]
 12: ((:METHOD ASDF/OPERATE:OPERATE (ASDF/OPERATION:OPERATION ASDF/COMPONENT:COMPONENT)) #<ASDF/LISP-ACTION:LOAD-OP :VERBOSE NIL> #<ASDF/SYSTEM:SYSTEM "cl-cuda"> :VERBOSE NIL) [fast-method]
 13: ((SB-PCL::EMF ASDF/OPERATE:OPERATE) #<unused argument> #<unused argument> #<ASDF/LISP-ACTION:LOAD-OP :VERBOSE NIL> #<ASDF/SYSTEM:SYSTEM "cl-cuda"> :VERBOSE NIL)
 14: ((LAMBDA NIL :IN ASDF/OPERATE:OPERATE))
 15: ((:METHOD ASDF/OPERATE:OPERATE :AROUND (T T)) #<ASDF/LISP-ACTION:LOAD-OP :VERBOSE NIL> #<ASDF/SYSTEM:SYSTEM "cl-cuda"> :VERBOSE NIL) [fast-method]
 16: ((SB-PCL::EMF ASDF/OPERATE:OPERATE) #<unused argument> #<unused argument> ASDF/LISP-ACTION:LOAD-OP "cl-cuda" :VERBOSE NIL)
 17: ((LAMBDA NIL :IN ASDF/OPERATE:OPERATE))
 18: (ASDF/CACHE:CALL-WITH-ASDF-CACHE #<CLOSURE (LAMBDA NIL :IN ASDF/OPERATE:OPERATE) {1003E1B22B}> :OVERRIDE NIL :KEY NIL)
 19: ((:METHOD ASDF/OPERATE:OPERATE :AROUND (T T)) ASDF/LISP-ACTION:LOAD-OP "cl-cuda" :VERBOSE NIL) [fast-method]
 20: ((:METHOD ASDF/OPERATE:OPERATE :AROUND (T T)) ASDF/LISP-ACTION:LOAD-OP "cl-cuda" :VERBOSE NIL) [fast-method]
 21: (ASDF/OPERATE:LOAD-SYSTEM "cl-cuda" :VERBOSE NIL)
 22: (QUICKLISP-CLIENT::CALL-WITH-MACROEXPAND-PROGRESS #<CLOSURE (LAMBDA NIL :IN QUICKLISP-CLIENT::APPLY-LOAD-STRATEGY) {1003D8125B}>)
 23: (QUICKLISP-CLIENT::AUTOLOAD-SYSTEM-AND-DEPENDENCIES "cl-cuda" :PROMPT NIL)
 24: ((:METHOD QL-IMPL-UTIL::%CALL-WITH-QUIET-COMPILATION (T T)) #<unavailable argument> #<CLOSURE (FLET QUICKLISP-CLIENT::QL :IN QUICKLISP-CLIENT:QUICKLOAD) {1004559C2B}>) [fast-method]
 25: ((:METHOD QL-IMPL-UTIL::%CALL-WITH-QUIET-COMPILATION :AROUND (QL-IMPL:SBCL T)) #<QL-IMPL:SBCL {10066F0833}> #<CLOSURE (FLET QUICKLISP-CLIENT::QL :IN QUICKLISP-CLIENT:QUICKLOAD) {1004559C2B}>) [fast-me..
 26: ((:METHOD QUICKLISP-CLIENT:QUICKLOAD (T)) #<unavailable argument> :PROMPT NIL :SILENT NIL :VERBOSE NIL) [fast-method]
 27: (QL-DIST::CALL-WITH-CONSISTENT-DISTS #<CLOSURE (LAMBDA NIL :IN QUICKLISP-CLIENT:QUICKLOAD) {100453EAFB}>)
 28: (SB-INT:SIMPLE-EVAL-IN-LEXENV (QUICKLISP-CLIENT:QUICKLOAD :CL-CUDA) #<NULL-LEXENV>)
 29: (EVAL (QUICKLISP-CLIENT:QUICKLOAD :CL-CUDA))
 30: (SWANK::EVAL-REGION "(ql:quickload :cl-cuda) ..)
 31: ((LAMBDA NIL :IN SWANK-REPL::REPL-EVAL))
 32: (SWANK-REPL::TRACK-PACKAGE #<CLOSURE (LAMBDA NIL :IN SWANK-REPL::REPL-EVAL) {100453E25B}>)
 33: (SWANK::CALL-WITH-RETRY-RESTART "Retry SLIME REPL evaluation request." #<CLOSURE (LAMBDA NIL :IN SWANK-REPL::REPL-EVAL) {100453E1BB}>)
 34: (SWANK::CALL-WITH-BUFFER-SYNTAX NIL #<CLOSURE (LAMBDA NIL :IN SWANK-REPL::REPL-EVAL) {100453E19B}>)
 35: (SWANK-REPL::REPL-EVAL "(ql:quickload :cl-cuda) ..)
 36: (SB-INT:SIMPLE-EVAL-IN-LEXENV (SWANK-REPL:LISTENER-EVAL "(ql:quickload :cl-cuda) ..)
 37: (EVAL (SWANK-REPL:LISTENER-EVAL "(ql:quickload :cl-cuda) ..)
 38: (SWANK:EVAL-FOR-EMACS (SWANK-REPL:LISTENER-EVAL "(ql:quickload :cl-cuda) ..)
 39: (SWANK::PROCESS-REQUESTS NIL)
 40: ((LAMBDA NIL :IN SWANK::HANDLE-REQUESTS))
 41: ((LAMBDA NIL :IN SWANK::HANDLE-REQUESTS))
 42: (SWANK/SBCL::CALL-WITH-BREAK-HOOK #<FUNCTION SWANK:SWANK-DEBUGGER-HOOK> #<CLOSURE (LAMBDA NIL :IN SWANK::HANDLE-REQUESTS) {1003DD000B}>)
 43: ((FLET SWANK/BACKEND:CALL-WITH-DEBUGGER-HOOK :IN "c:/Users/jaccarmac/.emacs.d/elpa/slime-20160614.1214/swank/sbcl.lisp") #<FUNCTION SWANK:SWANK-DEBUGGER-HOOK> #<CLOSURE (LAMBDA NIL :IN SWANK::HANDLE-R..
 44: (SWANK::CALL-WITH-BINDINGS ((*STANDARD-INPUT* . #1=#<SWANK/GRAY::SLIME-INPUT-STREAM {1003C7EB13}>) (*STANDARD-OUTPUT* . #2=#<SWANK/GRAY::SLIME-OUTPUT-STREAM {1003D8F743}>) (*TRACE-OUTPUT* . #2#) (*ERR..
 45: (SWANK::HANDLE-REQUESTS #<SWANK::MULTITHREADED-CONNECTION {1003220523}> NIL)
 46: ((FLET #:WITHOUT-INTERRUPTS-BODY-1161 :IN SB-THREAD::INITIAL-THREAD-FUNCTION-TRAMPOLINE))
 47: ((FLET SB-THREAD::WITH-MUTEX-THUNK :IN SB-THREAD::INITIAL-THREAD-FUNCTION-TRAMPOLINE))
 48: ((FLET #:WITHOUT-INTERRUPTS-BODY-359 :IN SB-THREAD::CALL-WITH-MUTEX))
 49: (SB-THREAD::CALL-WITH-MUTEX #<CLOSURE (FLET SB-THREAD::WITH-MUTEX-THUNK :IN SB-THREAD::INITIAL-THREAD-FUNCTION-TRAMPOLINE) {9F2FB5B}> #<SB-THREAD:MUTEX "thread result lock" owner: #<SB-THREAD:THREAD "..
 50: (SB-THREAD::INITIAL-THREAD-FUNCTION-TRAMPOLINE #<SB-THREAD:THREAD "repl-thread" RUNNING {1003DC8033}> NIL #<CLOSURE (LAMBDA NIL :IN SWANK-REPL::SPAWN-REPL-THREAD) {1003DBFF9B}> (#<SB-THREAD:THREAD "re..
 51: ("foreign function: #x42E6FC")
 52: ("foreign function: #x40334E")
 53: ("foreign function: #x8B6FE0")

jaccarmac avatar Jun 25 '16 21:06 jaccarmac

Ah... grovel... I missed what you mentioned at first. While I think about a workaround, how did it fail on MSYS2/MinGW64 at first?

but apparently the CUDA toolkit and MinGW don't play nice together.

takagi avatar Jun 26 '16 00:06 takagi

AFAICT (definitely not an expert systems programmer :-), NVIDIA distributes their dev environment as binaries, but provides .lib files for MSVC instead of DLLs, which means you have to do low-level lib twiddling to get them to link against MinGW's libc.

jaccarmac avatar Jun 26 '16 00:06 jaccarmac

Is it possible to call nvcuda.dll from SBCL on MinGW?

  • Running NVCC on command line.
  • Running external commands from Common Lisp.
  • Calling libcuda.dll via CFFI.
  • Groveling cuda.h with gcc. 

takagi avatar Jun 26 '16 01:06 takagi

I suppose that MinGW can call DLLs as well as GNU libraries, though I am not familiar with its calling convention.

takagi avatar Jun 26 '16 01:06 takagi

MinGW does use DLLs as its shared library format, but as I understand it they are linked to an old msvcr.dll. In any case, here are the results from running SBCL from inside a MinGW64 shell.

Subprocess (:PROCESS #<SB-IMPL::PROCESS :EXITED 1>)
 with command ("gcc" "-m64" "-o"
               "C:\\Users\\jaccarmac\\AppData\\Local\\cache\\common-lisp\\sbcl-1.3.6-win-x64\\C\\Users\\jaccarmac\\software\\quicklisp\\local-projects\\cl-cuda\\src\\driver-api\\type-grovel__grovel-tmpGHU3ALSV.exe"
               "-IC:/Users/jaccarmac/software/quicklisp/dists/quicklisp/software/cffi_0.17.1/"
               "C:\\Users\\jaccarmac\\AppData\\Local\\cache\\common-lisp\\sbcl-1.3.6-win-x64\\C\\Users\\jaccarmac\\software\\quicklisp\\local-projects\\cl-cuda\\src\\driver-api\\type-grovel__grovel.c")
 exited with error code 1
   [Condition of type CFFI-GROVEL:GROVEL-ERROR]

Restarts:
 0: [RETRY] Retry PROCESS-OP on #<CUDA-GROVEL-FILE "cl-cuda" "src" "driver-api" "type-grovel">.
 1: [ACCEPT] Continue, treating PROCESS-OP on #<CUDA-GROVEL-FILE "cl-cuda" "src" "driver-api" "type-grovel"> as having been successful.
 2: [RETRY] Retry ASDF operation.
 3: [CLEAR-CONFIGURATION-AND-RETRY] Retry ASDF operation after resetting the configuration.
 4: [ABORT] Give up on "cl-cuda"
 5: [RETRY] Retry SLIME REPL evaluation request.
 --more--

Backtrace:
  0: (CFFI-GROVEL:GROVEL-ERROR "~a" #<UIOP/RUN-PROGRAM:SUBPROCESS-ERROR {100614BC93}>)
  1: ((FLET #:THUNK :IN CFFI-GROVEL:PROCESS-GROVEL-FILE))
  2: (SB-IMPL::%WITH-STANDARD-IO-SYNTAX #<CLOSURE (FLET #:THUNK :IN CFFI-GROVEL:PROCESS-GROVEL-FILE) {9EEDDBB}>)
  3: (CFFI-GROVEL:PROCESS-GROVEL-FILE #P"C:/Users/jaccarmac/software/quicklisp/local-projects/cl-cuda/src/driver-api/type-grovel.lisp" #P"C:/Users/jaccarmac/AppData/Local/cache/common-lisp/sbcl-1.3.6-win-x..
  4: ((:METHOD ASDF/ACTION:PERFORM (CFFI-GROVEL::PROCESS-OP CFFI-GROVEL:GROVEL-FILE)) #<CFFI-GROVEL::PROCESS-OP > #<CL-CUDA-ASD::CUDA-GROVEL-FILE "cl-cuda" "src" "driver-api" "type-grovel">) [fast-method]
  5: ((SB-PCL::EMF ASDF/ACTION:PERFORM) #<unavailable argument> #<unavailable argument> #<CFFI-GROVEL::PROCESS-OP > #<CL-CUDA-ASD::CUDA-GROVEL-FILE "cl-cuda" "src" "driver-api" "type-grovel">)
  6: ((:METHOD ASDF/ACTION:PERFORM-WITH-RESTARTS :AROUND (T T)) #<CFFI-GROVEL::PROCESS-OP > #<CL-CUDA-ASD::CUDA-GROVEL-FILE "cl-cuda" "src" "driver-api" "type-grovel">) [fast-method]
  7: ((:METHOD ASDF/PLAN:PERFORM-PLAN (LIST)) ((#1=#<ASDF/LISP-ACTION:PREPARE-OP > . #2=#<ASDF/SYSTEM:SYSTEM "uiop">) (#<ASDF/LISP-ACTION:COMPILE-OP > . #2#) (#3=#<ASDF/LISP-ACTION:LOAD-OP > . #2#) (#1# . ..
  8: ((FLET SB-C::WITH-IT :IN SB-C::%WITH-COMPILATION-UNIT))
  9: ((:METHOD ASDF/PLAN:PERFORM-PLAN :AROUND (T)) ((#1=#<ASDF/LISP-ACTION:PREPARE-OP > . #2=#<ASDF/SYSTEM:SYSTEM "uiop">) (#<ASDF/LISP-ACTION:COMPILE-OP > . #2#) (#3=#<ASDF/LISP-ACTION:LOAD-OP > . #2#) (#..
 10: ((FLET SB-C::WITH-IT :IN SB-C::%WITH-COMPILATION-UNIT))
 11: ((:METHOD ASDF/PLAN:PERFORM-PLAN :AROUND (T)) #<ASDF/PLAN:SEQUENTIAL-PLAN {1003781C63}> :VERBOSE NIL) [fast-method]
 12: ((:METHOD ASDF/OPERATE:OPERATE (ASDF/OPERATION:OPERATION ASDF/COMPONENT:COMPONENT)) #<ASDF/LISP-ACTION:LOAD-OP :VERBOSE NIL> #<ASDF/SYSTEM:SYSTEM "cl-cuda"> :VERBOSE NIL) [fast-method]
 13: ((SB-PCL::EMF ASDF/OPERATE:OPERATE) #<unused argument> #<unused argument> #<ASDF/LISP-ACTION:LOAD-OP :VERBOSE NIL> #<ASDF/SYSTEM:SYSTEM "cl-cuda"> :VERBOSE NIL)
 14: ((LAMBDA NIL :IN ASDF/OPERATE:OPERATE))
 15: ((:METHOD ASDF/OPERATE:OPERATE :AROUND (T T)) #<ASDF/LISP-ACTION:LOAD-OP :VERBOSE NIL> #<ASDF/SYSTEM:SYSTEM "cl-cuda"> :VERBOSE NIL) [fast-method]
 16: ((SB-PCL::EMF ASDF/OPERATE:OPERATE) #<unused argument> #<unused argument> ASDF/LISP-ACTION:LOAD-OP "cl-cuda" :VERBOSE NIL)
 17: ((LAMBDA NIL :IN ASDF/OPERATE:OPERATE))
 18: (ASDF/CACHE:CALL-WITH-ASDF-CACHE #<CLOSURE (LAMBDA NIL :IN ASDF/OPERATE:OPERATE) {100377322B}> :OVERRIDE NIL :KEY NIL)
 19: ((:METHOD ASDF/OPERATE:OPERATE :AROUND (T T)) ASDF/LISP-ACTION:LOAD-OP "cl-cuda" :VERBOSE NIL) [fast-method]
 20: ((:METHOD ASDF/OPERATE:OPERATE :AROUND (T T)) ASDF/LISP-ACTION:LOAD-OP "cl-cuda" :VERBOSE NIL) [fast-method]
 21: (ASDF/OPERATE:LOAD-SYSTEM "cl-cuda" :VERBOSE NIL)
 22: (QUICKLISP-CLIENT::CALL-WITH-MACROEXPAND-PROGRESS #<CLOSURE (LAMBDA NIL :IN QUICKLISP-CLIENT::APPLY-LOAD-STRATEGY) {100371125B}>)
 23: (QUICKLISP-CLIENT::AUTOLOAD-SYSTEM-AND-DEPENDENCIES "cl-cuda" :PROMPT NIL)
 24: ((:METHOD QL-IMPL-UTIL::%CALL-WITH-QUIET-COMPILATION (T T)) #<unavailable argument> #<CLOSURE (FLET QUICKLISP-CLIENT::QL :IN QUICKLISP-CLIENT:QUICKLOAD) {1003DE55FB}>) [fast-method]
 25: ((:METHOD QL-IMPL-UTIL::%CALL-WITH-QUIET-COMPILATION :AROUND (QL-IMPL:SBCL T)) #<QL-IMPL:SBCL {10066F0833}> #<CLOSURE (FLET QUICKLISP-CLIENT::QL :IN QUICKLISP-CLIENT:QUICKLOAD) {1003DE55FB}>) [fast-me..
 26: ((:METHOD QUICKLISP-CLIENT:QUICKLOAD (T)) #<unavailable argument> :PROMPT NIL :SILENT NIL :VERBOSE NIL) [fast-method]
 27: (QL-DIST::CALL-WITH-CONSISTENT-DISTS #<CLOSURE (LAMBDA NIL :IN QUICKLISP-CLIENT:QUICKLOAD) {1003DC330B}>)
 28: (SB-INT:SIMPLE-EVAL-IN-LEXENV (QUICKLISP-CLIENT:QUICKLOAD :CL-CUDA) #<NULL-LEXENV>)
 29: (EVAL (QUICKLISP-CLIENT:QUICKLOAD :CL-CUDA))
 30: (SWANK::EVAL-REGION "(ql:quickload :cl-cuda) ..)
 31: ((LAMBDA NIL :IN SWANK-REPL::REPL-EVAL))
 32: (SWANK-REPL::TRACK-PACKAGE #<CLOSURE (LAMBDA NIL :IN SWANK-REPL::REPL-EVAL) {1003DC2A6B}>)
 33: (SWANK::CALL-WITH-RETRY-RESTART "Retry SLIME REPL evaluation request." #<CLOSURE (LAMBDA NIL :IN SWANK-REPL::REPL-EVAL) {1003DC29CB}>)
 34: (SWANK::CALL-WITH-BUFFER-SYNTAX NIL #<CLOSURE (LAMBDA NIL :IN SWANK-REPL::REPL-EVAL) {1003DC29AB}>)
 35: (SWANK-REPL::REPL-EVAL "(ql:quickload :cl-cuda) ..)
 36: (SB-INT:SIMPLE-EVAL-IN-LEXENV (SWANK-REPL:LISTENER-EVAL "(ql:quickload :cl-cuda) ..)
 37: (EVAL (SWANK-REPL:LISTENER-EVAL "(ql:quickload :cl-cuda) ..)
 38: (SWANK:EVAL-FOR-EMACS (SWANK-REPL:LISTENER-EVAL "(ql:quickload :cl-cuda) ..)
 39: (SWANK::PROCESS-REQUESTS NIL)
 40: ((LAMBDA NIL :IN SWANK::HANDLE-REQUESTS))
 41: ((LAMBDA NIL :IN SWANK::HANDLE-REQUESTS))
 42: (SWANK/SBCL::CALL-WITH-BREAK-HOOK #<FUNCTION SWANK:SWANK-DEBUGGER-HOOK> #<CLOSURE (LAMBDA NIL :IN SWANK::HANDLE-REQUESTS) {1003DC000B}>)
 43: ((FLET SWANK/BACKEND:CALL-WITH-DEBUGGER-HOOK :IN "c:/Users/jaccarmac/.emacs.d/elpa/slime-20160614.1214/swank/sbcl.lisp") #<FUNCTION SWANK:SWANK-DEBUGGER-HOOK> #<CLOSURE (LAMBDA NIL :IN SWANK::HANDLE-R..
 44: (SWANK::CALL-WITH-BINDINGS ((*STANDARD-INPUT* . #1=#<SWANK/GRAY::SLIME-INPUT-STREAM {1003C76B13}>) (*STANDARD-OUTPUT* . #2=#<SWANK/GRAY::SLIME-OUTPUT-STREAM {1003D87DF3}>) (*TRACE-OUTPUT* . #2#) (*ERR..
 45: (SWANK::HANDLE-REQUESTS #<SWANK::MULTITHREADED-CONNECTION {1003220523}> NIL)
 46: ((FLET #:WITHOUT-INTERRUPTS-BODY-1161 :IN SB-THREAD::INITIAL-THREAD-FUNCTION-TRAMPOLINE))
 47: ((FLET SB-THREAD::WITH-MUTEX-THUNK :IN SB-THREAD::INITIAL-THREAD-FUNCTION-TRAMPOLINE))
 48: ((FLET #:WITHOUT-INTERRUPTS-BODY-359 :IN SB-THREAD::CALL-WITH-MUTEX))
 49: (SB-THREAD::CALL-WITH-MUTEX #<CLOSURE (FLET SB-THREAD::WITH-MUTEX-THUNK :IN SB-THREAD::INITIAL-THREAD-FUNCTION-TRAMPOLINE) {9EEFB5B}> #<SB-THREAD:MUTEX "thread result lock" owner: #<SB-THREAD:THREAD "..
 50: (SB-THREAD::INITIAL-THREAD-FUNCTION-TRAMPOLINE #<SB-THREAD:THREAD "repl-thread" RUNNING {1003DB8033}> NIL #<CLOSURE (LAMBDA NIL :IN SWANK-REPL::SPAWN-REPL-THREAD) {1003DB7F9B}> (#<SB-THREAD:THREAD "re..
 51: ("foreign function: #x42E6FC")
 52: ("foreign function: #x40334E")
 53: ("foreign function: #x2637DA0")

jaccarmac avatar Jun 26 '16 02:06 jaccarmac

What does gcc return if executed directly? You should be able to find some error messages.

gcc -m64 -o C:\Users\jaccarmac\AppData\Local\cache\common-lisp\sbcl-1.3.6-win-x64\C\Users\jaccarmac\software\quicklisp\local-projects\cl-cuda\src\driver-api\type-grovel__grovel-tmpGHU3ALSV.exe -IC:/Users/jaccarmac/software/quicklisp/dists/quicklisp/software/cffi_0.17.1/ C:\Users\jaccarmac\AppData\Local\cache\common-lisp\sbcl-1.3.6-win-x64\C\Users\jaccarmac\software\quicklisp\local-projects\cl-cuda\src\driver-api\type-grovel__grovel.c

takagi avatar Jun 26 '16 03:06 takagi

Command as written fails because C:\Users\jaccarmac\AppData\Local\cache\common-lisp\sbcl-1.3.6-win-x64\C\Users\jaccarmac\software\quicklisp\local-projects\cl-cuda\src\driver-api\type-grovel__grovel-tmpGHU3ALSV.exe is not a valid path.

jaccarmac avatar Jun 26 '16 04:06 jaccarmac

(Note the C\ in the middle of the pathname.)

jaccarmac avatar Jun 26 '16 04:06 jaccarmac

Do you not have the path C:\Users\jaccarmac\AppData\Local\cache\common-lisp\sbcl-1.3.6-win-x64\C\? Or could it be because of escape sequences? How about this?

gcc -m64 -o C:\\Users\\jaccarmac\\AppData\\Local\\cache\\common-lisp\\sbcl-1.3.6-win-x64\\C\\Users\\jaccarmac\\software\\quicklisp\\local-projects\\cl-cuda\\src\\driver-api\\type-grovel__grovel-tmpGHU3ALSV.exe -IC:/Users/jaccarmac/software/quicklisp/dists/quicklisp/software/cffi_0.17.1/ C:\\Users\\jaccarmac\\AppData\\Local\\cache\\common-lisp\\sbcl-1.3.6-win-x64\\C\\Users\\jaccarmac\\software\\quicklisp\\local-projects\\cl-cuda\\src\\driver-api\\type-grovel__grovel.c

takagi avatar Jun 26 '16 05:06 takagi

Aha, that was it. cuda.h is missing from gcc's search path.

jaccarmac avatar Jun 26 '16 06:06 jaccarmac

Please add the directory containing cuda.h to the environment variable C_INCLUDE_PATH. Can you find cuda.h in your environment?
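
To check from the REPL whether gcc would see the header, something like the following sketch may help. The function name find-header-in-include-path is hypothetical, and the choice of ";" as the list separator on Windows is an assumption about how MinGW gcc reads C_INCLUDE_PATH.

```lisp
;; Sketch: look for cuda.h in the directories listed in C_INCLUDE_PATH.
;; FIND-HEADER-IN-INCLUDE-PATH is a hypothetical helper, not part of cl-cuda.
(require :asdf)

(defun find-header-in-include-path (&optional (header "cuda.h"))
  "Return the first existing HEADER under a C_INCLUDE_PATH directory, or NIL."
  (let ((value (uiop:getenv "C_INCLUDE_PATH"))
        ;; Assumption: ';' separates entries on Windows, ':' elsewhere.
        (separator (if (uiop:os-windows-p) ";" ":")))
    (when value
      (loop for dir in (uiop:split-string value :separator separator)
            for candidate = (and (plusp (length dir))
                                 (merge-pathnames
                                  header
                                  (uiop:ensure-directory-pathname dir)))
            when (and candidate (uiop:file-exists-p candidate))
              return candidate))))
```

If (find-header-in-include-path) returns NIL, the variable is either unset in the Lisp process or does not point at the CUDA include directory.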

takagi avatar Jun 26 '16 08:06 takagi

That seems to work! Thanks!

jaccarmac avatar Jun 26 '16 08:06 jaccarmac

I see you have a list of supported environments in the README. If you let me know how to run the test suite, I can verify it passes and submit a PR with my specifics.

jaccarmac avatar Jun 26 '16 08:06 jaccarmac

The test programs and (asdf:oos 'asdf:test-op '#:cl-cuda) both fail with The alien function "cuInit" is undefined.

jaccarmac avatar Jun 26 '16 08:06 jaccarmac

That would help me a lot. You can run the tests with (ql:quickload :cl-cuda-test).

takagi avatar Jun 26 '16 08:06 takagi

The alien function "cuInit" is undefined. is what I'm still getting. Natively or from MSYS console, with or without changes to PATH or C_INCLUDE_PATH.

jaccarmac avatar Jun 26 '16 08:06 jaccarmac

cuInit should be defined via CFFI:DEFCFUN in cl-cuda/src/driver-api/function.lisp. Something may still be missing that is needed to call the API in nvcuda.dll. Let me think for a while.

takagi avatar Jun 26 '16 09:06 takagi

Would you try it again with the following fix in cl-cuda/src/driver-api/library.lisp?

  (cffi:define-foreign-library libcuda
+   (:windows "nvcuda.dll")
+   ; (:windows "nvcuda.dll" :convention :stdcall)
    (:darwin (:framework "CUDA"))
    (:unix (:or "libcuda.so" "libcuda64.so")))

At least, The alien function "cuInit" is undefined. is caused by the foreign library definition missing a line for Windows, but I do not know whether :convention :stdcall is required or not.
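
As a standalone check outside cl-cuda, the library definition and a direct call to cuInit could be sketched like this. The names cuda-driver and try-cu-init are hypothetical, Quicklisp is assumed to be installed, and the :stdcall variant is left out because it is unclear whether it is needed.

```lisp
;; Sketch: load the CUDA driver library per platform and call cuInit directly.
;; CUDA-DRIVER and TRY-CU-INIT are hypothetical names, not cl-cuda's own.
(ql:quickload :cffi)

(cffi:define-foreign-library cuda-driver
  (:windows "nvcuda.dll")
  (:darwin (:framework "CUDA"))
  (:unix (:or "libcuda.so" "libcuda64.so")))

(defun try-cu-init ()
  "Return the CUresult code of cuInit(0), or NIL if the library cannot be loaded."
  (handler-case
      (progn
        (cffi:load-foreign-library 'cuda-driver)
        ;; C signature: CUresult cuInit(unsigned int Flags); 0 is CUDA_SUCCESS.
        (cffi:foreign-funcall "cuInit" :unsigned-int 0 :int))
    (cffi:load-foreign-library-error () nil)))
```

A result of 0 from (try-cu-init) would suggest that the driver library is reachable through CFFI on that platform.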

takagi avatar Jun 26 '16 09:06 takagi

Both versions seem to work, and further into the test suite we get The function OSICAT-POSIX:MKTEMP is undefined..

jaccarmac avatar Jun 26 '16 09:06 jaccarmac

This test also fails further up the chain.

 ? basic case 4
    "float3_add( __make_float3( 1.0f, 1.0f, 1.0f ), __make_float3( 2.0f, 2.0f, 2.0f ) )" is expected to be "float3_add( make_float3( 1.0f, 1.0f, 1.0f ), make_float3( 2.0f, 2.0f, 2.0f ) )"

jaccarmac avatar Jun 26 '16 09:06 jaccarmac

Both versions seem to work, and further into the test suite we get The function OSICAT-POSIX:MKTEMP is undefined..

OSICAT-POSIX:MKTEMP might not work on Windows, so please apply this patch as a workaround. I use MKTEMP just to make a temporary file name. I will fix it later.

cl-cuda/src/api/nvcc.lisp

  (defun get-cu-path ()
+   (let ((name "cl-cuda.tmp"))
-   (let ((name (format nil "cl-cuda.~A" (osicat-posix:mktemp))))
      (make-pathname :name name :type "cu" :defaults (get-tmp-path))))
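
Since the fixed name in the workaround would be shared by every process, one possible portable replacement is a random suffix. This is only a sketch of an alternative, not the fix that cl-cuda adopted: unique-cu-path is a hypothetical name, uiop:temporary-directory stands in for cl-cuda's get-tmp-path, and the file is not created, so collisions remain possible.

```lisp
;; Sketch: build a temporary .cu pathname without osicat-posix:mktemp.
;; UNIQUE-CU-PATH is a hypothetical stand-in for cl-cuda's GET-CU-PATH.
(require :asdf)

(defun unique-cu-path (&optional (directory (uiop:temporary-directory)))
  "Return a pathname named cl-cuda.XXXXXXXX.cu in DIRECTORY, X being base-36 digits."
  (let* ((state (make-random-state t)) ; freshly seeded on each call
         ;; Eight base-36 digits, left-padded with zeros.
         (suffix (format nil "~36,8,'0R" (random (expt 36 8) state)))
         (name (format nil "cl-cuda.~A" suffix)))
    (make-pathname :name name :type "cu" :defaults directory)))
```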

takagi avatar Jun 26 '16 10:06 takagi