ILGPU
[BUG]: cublasIsamax_v2
Describe the bug
I get the following crash when I attempt to use Amax(ArrayView1D<float, Stride1D.General> input, ArrayView
Fatal error. System.Runtime.InteropServices.SEHException (0x80004005): External component has thrown an exception.
Repeat 2 times:
--------------------------------
at ILGPU.Runtime.Cuda.API.CuBlasAPI_Windows_V11.cublasIsamax_v2(IntPtr, Int32, Void*, Int32, Void*)
--------------------------------
at ILGPU.Runtime.Cuda.API.CuBlasAPI_Windows_V11.Isamax_v2(IntPtr, Int32, Void*, Int32, Void*)
at ILGPU.Runtime.Cuda.CuBlas`1[[ILGPU.Runtime.Cuda.CuBlasPointerModeHandlers+ManualMode, ILGPU.Algorithms, Version=1.5.1.0, Culture=neutral, PublicKeyToken=null]].Amax(ILGPU.Runtime.ArrayView1D`2<Single,General>, ILGPU.ArrayView`1<Int32>)
at Gotham.FurrowUtilities.BSGpu.ComputeTargetSpotStrikeRatio(Config, PriceBucketConfig, ILGPU.Runtime.MemoryBuffer1D`2<Double,Dense>)
at Gotham.FurrowUtilities.BSGpu.ValueCallOption(Config, Int32, Boolean)
at Gotham.FurrowUtilities.Reloading.Main(System.String[])
I suspect the issue may be that the output ArrayView is allocated on the GPU, which cuBLAS perhaps does not allow for the result pointer.
nvidia-smi output:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 511.65 Driver Version: 511.65 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla T4 TCC | 00000000:3B:00.0 Off | Off |
| N/A 37C P8 9W / 70W | 1MiB / 16384MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
Environment
- ILGPU version: 1.5.1
- .NET version: .NET 8
- Operating system: Windows Server 2016 Standard
- Hardware (if GPU-related): Tesla T4
- cuBLAS: 6.14.11.1192
Steps to reproduce
var generalTemp = Accelerator.Allocate1D<float, Stride1D.General>(1024, new Stride1D.General());
var generalTarget = Accelerator.Allocate1D<int, Stride1D.General>(1, new Stride1D.General());
Blas.Amax(generalTemp, generalTarget.AsArrayView<int>(0, 1));
Expected behavior
An int is placed into generalTarget.
Instead there's a crash, as described above.
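A possible workaround, offered as an untested sketch rather than a confirmed fix: the stack trace shows the `ManualMode` pointer-mode handler, so cuBLAS may still be in host pointer mode while the result pointer refers to device memory. The sketch below assumes that `CuBlas` exposes a settable `PointerMode` property taking `CuBlasPointerMode.Device`, and that `AsGeneral()` and `BaseView` are available on 1D views. The class name and the input values are hypothetical.

```csharp
using System;
using ILGPU;
using ILGPU.Runtime;
using ILGPU.Runtime.Cuda;

// Hypothetical stand-alone program; the class name is illustrative only.
class AmaxPointerModeSketch
{
    static void Main()
    {
        using var context = Context.Create(builder => builder.Cuda());
        using var accelerator = context.CreateCudaAccelerator(0);

        // Illustrative input data (not taken from the report).
        using var input = accelerator.Allocate1D(new float[] { 1f, -7f, 3f });
        // One-element device buffer that receives the index.
        using var target = accelerator.Allocate1D<int>(1);

        using var blas = new CuBlas(accelerator);

        // Assumption: with the manual handler the caller must switch cuBLAS
        // to device pointer mode before passing a device-side result pointer.
        blas.PointerMode = CuBlasPointerMode.Device;

        // AsGeneral() converts the dense view to a general-stride view;
        // BaseView yields the plain ArrayView<int> that this overload expects.
        blas.Amax(input.View.AsGeneral(), target.View.BaseView);
        accelerator.Synchronize();

        // cuBLAS reports 1-based indices.
        int[] result = target.GetAsArray1D();
        Console.WriteLine(result[0]);
    }
}
```

If this runs without the SEHException, that would suggest the crash comes from the pointer mode rather than from the GPU-side allocation itself.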
Additional context
No response