
[BUG]: cublasIsamax_v2

Status: Open • Ruberik opened this issue 1 year ago • 0 comments

Describe the bug

I get the following crash when I call Amax(ArrayView1D<float, Stride1D.General> input, ArrayView<int> output):

Fatal error. System.Runtime.InteropServices.SEHException (0x80004005): External component has thrown an exception.
Repeat 2 times:
--------------------------------
   at ILGPU.Runtime.Cuda.API.CuBlasAPI_Windows_V11.cublasIsamax_v2(IntPtr, Int32, Void*, Int32, Void*)
--------------------------------
   at ILGPU.Runtime.Cuda.API.CuBlasAPI_Windows_V11.Isamax_v2(IntPtr, Int32, Void*, Int32, Void*)
   at ILGPU.Runtime.Cuda.CuBlas`1[[ILGPU.Runtime.Cuda.CuBlasPointerModeHandlers+ManualMode, ILGPU.Algorithms, Version=1.5.1.0, Culture=neutral, PublicKeyToken=null]].Amax(ILGPU.Runtime.ArrayView1D`2<Single,General>, ILGPU.ArrayView`1<Int32>)
   at Gotham.FurrowUtilities.BSGpu.ComputeTargetSpotStrikeRatio(Config, PriceBucketConfig, ILGPU.Runtime.MemoryBuffer1D`2<Double,Dense>)
   at Gotham.FurrowUtilities.BSGpu.ValueCallOption(Config, Int32, Boolean)
   at Gotham.FurrowUtilities.Reloading.Main(System.String[])

I suspect the cause is that my ArrayView is allocated GPU-side, which cuBLAS may not allow here.

nvidia-smi output:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 511.65       Driver Version: 511.65       CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            TCC  | 00000000:3B:00.0 Off |                  Off |
| N/A   37C    P8     9W /  70W |      1MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Environment

  • ILGPU version: 1.5.1
  • .NET version: .NET 8
  • Operating system: Windows Server 2016 Standard
  • Hardware (if GPU-related): Tesla T4
  • cuBLAS: 6.14.11.1192

Steps to reproduce

var generalTemp = Accelerator.Allocate1D<float, Stride1D.General>(1024, new Stride1D.General());
var generalTarget = Accelerator.Allocate1D<int, Stride1D.General>(1, new Stride1D.General());
Blas.Amax(generalTemp, generalTarget.AsArrayView<int>(0, 1));

Expected behavior

The index computed by Amax is written as an int into generalTarget.

Instead, the process crashes with the SEHException shown above.

Additional context

No response

Ruberik • Aug 12 '24 22:08