cudnn-frontend
Matmul test failure
I encountered a test failure after building and running the tests. Here are the details:
- GPU: RTX 4090
- Repo branch: v1.4.0
- Operating System: Ubuntu 22.04.3
- CUDA version: 12.2
- cuDNN version: 8.9.7
- g++ version: 11.4.0
I followed the build instructions as provided in the README:
mkdir build
cd build
cmake ..
make -j8
Output is:
-- The C compiler identification is GNU 11.4.0
-- The CXX compiler identification is GNU 11.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found CUDAToolkit: /usr/local/cuda-12.2/targets/x86_64-linux/include (found version "12.2.140")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Performing Test HAVE_FLAG__ffile_prefix_map__nvme2_medsam_cuda_mode_cudnn_frontend_build__deps_catch2_src__
-- Performing Test HAVE_FLAG__ffile_prefix_map__nvme2_medsam_cuda_mode_cudnn_frontend_build__deps_catch2_src__ - Success
-- cudnn found at /usr/local/cuda-12.2/lib64/libcudnn.so.
-- Found LIBRARY: /usr/local/cuda-12.2/include
-- cuDNN: /usr/local/cuda-12.2/lib64/libcudnn.so
-- cuDNN: /usr/local/cuda-12.2/include
-- cudnn_adv_infer found at /usr/local/cuda-12.2/lib64/libcudnn_adv_infer.so.
-- cudnn_adv_train found at /usr/local/cuda-12.2/lib64/libcudnn_adv_train.so.
-- cudnn_cnn_infer found at /usr/local/cuda-12.2/lib64/libcudnn_cnn_infer.so.
-- cudnn_cnn_train found at /usr/local/cuda-12.2/lib64/libcudnn_cnn_train.so.
-- cudnn_ops_infer found at /usr/local/cuda-12.2/lib64/libcudnn_ops_infer.so.
-- cudnn_ops_train found at /usr/local/cuda-12.2/lib64/libcudnn_ops_train.so.
-- cudnn found at /usr/local/cuda-12.2/lib64/libcudnn.so.
-- cuDNN: /usr/local/cuda-12.2/lib64/libcudnn.so
-- cuDNN: /usr/local/cuda-12.2/include
-- cudnn_adv_infer found at /usr/local/cuda-12.2/lib64/libcudnn_adv_infer.so.
-- cudnn_adv_train found at /usr/local/cuda-12.2/lib64/libcudnn_adv_train.so.
-- cudnn_cnn_infer found at /usr/local/cuda-12.2/lib64/libcudnn_cnn_infer.so.
-- cudnn_cnn_train found at /usr/local/cuda-12.2/lib64/libcudnn_cnn_train.so.
-- cudnn_ops_infer found at /usr/local/cuda-12.2/lib64/libcudnn_ops_infer.so.
-- cudnn_ops_train found at /usr/local/cuda-12.2/lib64/libcudnn_ops_train.so.
-- Configuring done (6.0s)
-- Generating done (0.0s)
-- Build files have been written to: /nvme2/medsam/cuda-mode/cudnn-frontend/build
[100%] Linking CXX executable ../bin/samples
Warning: Unused direct dependencies:
/usr/local/cuda-12.2/lib64/libnvrtc.so.12
/usr/local/cuda-12.2/lib64/libnvrtc-builtins.so.12.2
/lib/x86_64-linux-gnu/libcuda.so.1
/usr/local/cuda-12.2/lib64/libnvJitLink.so.12
/usr/local/cuda-12.2/lib64/libcudnn_adv_train.so.8
/usr/local/cuda-12.2/lib64/libcudnn_ops_train.so.8
/usr/local/cuda-12.2/lib64/libcudnn_cnn_train.so.8
/usr/local/cuda-12.2/lib64/libcudnn_adv_infer.so.8
/usr/local/cuda-12.2/lib64/libcudnn_cnn_infer.so.8
/usr/local/cuda-12.2/lib64/libcudnn_ops_infer.so.8
[100%] Built target samples
Then I ran the MatMul test:
CUDNN_FRONTEND_LOG_FILE=stdout CUDNN_FRONTEND_LOG_INFO=1 ./build/bin/samples MatMul
Output is:
Filters: "MatMul"
Randomness seeded to: 1045110732
[cudnn_frontend] INFO: Validating matmul node GEMM...
[cudnn_frontend] INFO: Inferrencing properties for matmul node GEMM...
[cudnn_frontend] INFO: Creating cudnn tensors for node named 'GEMM':
[cudnn_frontend] INFO: CUDNN_BACKEND_TENSOR_DESCRIPTOR : Datatype: ["BFLOAT16"] Id: 2 nDims 3 VectorCount: 1 vectorDimension -1 Dim [ 16,32,128 ] Str [ 4096,128,1 ] isVirtual: 0 isByValue: 0 Alignment: 16 reorder_type: ["NONE"]
[cudnn_frontend] INFO: CUDNN_BACKEND_TENSOR_DESCRIPTOR : Datatype: ["BFLOAT16"] Id: 3 nDims 3 VectorCount: 1 vectorDimension -1 Dim [ 16,128,64 ] Str [ 8192,64,1 ] isVirtual: 0 isByValue: 0 Alignment: 16 reorder_type: ["NONE"]
[cudnn_frontend] INFO: CUDNN_BACKEND_TENSOR_DESCRIPTOR : Datatype: ["FLOAT"] Id: 4 nDims 3 VectorCount: 1 vectorDimension -1 Dim [ 16,32,64 ] Str [ 2048,64,1 ] isVirtual: 0 isByValue: 0 Alignment: 16 reorder_type: ["NONE"]
[cudnn_frontend] INFO: Building MatmulNode operations GEMM...
[cudnn_frontend] CUDNN_BACKEND_MATMUL_DESCRIPTOR : Math precision ["FLOAT"]
[cudnn_frontend] CUDNN_BACKEND_OPERATIONGRAPH_DESCRIPTOR has 1operations.
Tag: Matmul_
[cudnn_frontend] INFO: Getting plan from heuristics for Matmul_ ...
[cudnn_frontend] CUDNN_BACKEND_ENGINEHEUR_DESCRIPTOR :
Heuristic Mode 3 has 6 configurations
[cudnn_frontend] INFO: get_heuristics_list statuses: CUDNN_STATUS_SUCCESS
[cudnn_frontend] INFO: config list has 6 configurations.
[cudnn_frontend] INFO: config list has 6 good configurations.
[cudnn_frontend] INFO: Extracting engine configs.
[cudnn_frontend] INFO: Querying engine config properties
[cudnn_frontend] ERROR: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_EXECUTION_FAILED. ["GRAPH_EXECUTION_PLAN_CREATION_FAILED"] because plan building failed at /nvme2/medsam/cuda-mode/cudnn-frontend/include/cudnn_frontend/plans.h:179
[cudnn_frontend] INFO: Building plan at index 0 gave ["GRAPH_EXECUTION_PLAN_CREATION_FAILED"] with message: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_EXECUTION_FAILED
[cudnn_frontend] ERROR: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_EXECUTION_FAILED. ["GRAPH_EXECUTION_PLAN_CREATION_FAILED"] because plan building failed at /nvme2/medsam/cuda-mode/cudnn-frontend/include/cudnn_frontend/plans.h:179
[cudnn_frontend] INFO: Building plan at index 1 gave ["GRAPH_EXECUTION_PLAN_CREATION_FAILED"] with message: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_EXECUTION_FAILED
[cudnn_frontend] ERROR: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_EXECUTION_FAILED. ["GRAPH_EXECUTION_PLAN_CREATION_FAILED"] because plan building failed at /nvme2/medsam/cuda-mode/cudnn-frontend/include/cudnn_frontend/plans.h:179
[cudnn_frontend] INFO: Building plan at index 2 gave ["GRAPH_EXECUTION_PLAN_CREATION_FAILED"] with message: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_EXECUTION_FAILED
[cudnn_frontend] ERROR: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_EXECUTION_FAILED. ["GRAPH_EXECUTION_PLAN_CREATION_FAILED"] because plan building failed at /nvme2/medsam/cuda-mode/cudnn-frontend/include/cudnn_frontend/plans.h:179
[cudnn_frontend] INFO: Building plan at index 3 gave ["GRAPH_EXECUTION_PLAN_CREATION_FAILED"] with message: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_EXECUTION_FAILED
[cudnn_frontend] ERROR: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_EXECUTION_FAILED. ["GRAPH_EXECUTION_PLAN_CREATION_FAILED"] because plan building failed at /nvme2/medsam/cuda-mode/cudnn-frontend/include/cudnn_frontend/plans.h:179
[cudnn_frontend] INFO: Building plan at index 4 gave ["GRAPH_EXECUTION_PLAN_CREATION_FAILED"] with message: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_EXECUTION_FAILED
[cudnn_frontend] ERROR: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_EXECUTION_FAILED. ["GRAPH_EXECUTION_PLAN_CREATION_FAILED"] because plan building failed at /nvme2/medsam/cuda-mode/cudnn-frontend/include/cudnn_frontend/plans.h:179
[cudnn_frontend] INFO: Building plan at index 5 gave ["GRAPH_EXECUTION_PLAN_CREATION_FAILED"] with message: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_EXECUTION_FAILED
[cudnn_frontend] ERROR: plans.check_support(h) at /nvme2/medsam/cuda-mode/cudnn-frontend/include/cudnn_frontend/graph_interface.h:260
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
samples is a Catch2 v3.3.2 host application.
Run with -? for options
-------------------------------------------------------------------------------
Matmul
-------------------------------------------------------------------------------
/nvme2/medsam/cuda-mode/cudnn-frontend/samples/cpp/matmuls.cpp:31
...............................................................................
/nvme2/medsam/cuda-mode/cudnn-frontend/samples/cpp/matmuls.cpp:80: FAILED:
REQUIRE( graph.check_support(handle).is_good() )
with expansion:
false
===============================================================================
test cases: 1 | 0 passed | 1 failed
assertions: 11 | 10 passed | 1 failed
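For anyone triaging a log like the one above, here is a minimal Python sketch that tallies the per-plan build results and checks whether each logged tensor uses fully packed row-major strides. It is not part of cudnn-frontend: the helper names `summarize` and `packed_strides` are hypothetical, the regular expressions assume the exact log wording shown in this report, and `SAMPLE` is an excerpt of that log with each line trimmed to the fields the sketch reads.

```python
import re

# Patterns assume the log wording shown in this report (an assumption, not a stable format).
PLAN_RE = re.compile(r'Building plan at index (\d+) gave \["([A-Z_]+)"\]')
TENSOR_RE = re.compile(r'Id: (\d+) .*?Dim \[ ([\d,]+) \] Str \[ ([\d,]+) \]')


def packed_strides(dims):
    """Return the row-major (fully packed) strides for the given dims."""
    strides, step = [], 1
    for d in reversed(dims):
        strides.append(step)
        step *= d
    return strides[::-1]


def summarize(log_text):
    """Return ({plan index: status}, {tensor id: (dims, strides, is_packed)})."""
    plans = {int(i): status for i, status in PLAN_RE.findall(log_text)}
    tensors = {}
    for uid, dims, strides in TENSOR_RE.findall(log_text):
        dims = [int(x) for x in dims.split(",")]
        strides = [int(x) for x in strides.split(",")]
        tensors[int(uid)] = (dims, strides, strides == packed_strides(dims))
    return plans, tensors


# Excerpt of the log in this report; lines are trimmed, not altered.
SAMPLE = """
CUDNN_BACKEND_TENSOR_DESCRIPTOR : Datatype: ["BFLOAT16"] Id: 2 nDims 3 VectorCount: 1 vectorDimension -1 Dim [ 16,32,128 ] Str [ 4096,128,1 ]
CUDNN_BACKEND_TENSOR_DESCRIPTOR : Datatype: ["BFLOAT16"] Id: 3 nDims 3 VectorCount: 1 vectorDimension -1 Dim [ 16,128,64 ] Str [ 8192,64,1 ]
CUDNN_BACKEND_TENSOR_DESCRIPTOR : Datatype: ["FLOAT"] Id: 4 nDims 3 VectorCount: 1 vectorDimension -1 Dim [ 16,32,64 ] Str [ 2048,64,1 ]
Building plan at index 0 gave ["GRAPH_EXECUTION_PLAN_CREATION_FAILED"]
Building plan at index 1 gave ["GRAPH_EXECUTION_PLAN_CREATION_FAILED"]
Building plan at index 2 gave ["GRAPH_EXECUTION_PLAN_CREATION_FAILED"]
Building plan at index 3 gave ["GRAPH_EXECUTION_PLAN_CREATION_FAILED"]
Building plan at index 4 gave ["GRAPH_EXECUTION_PLAN_CREATION_FAILED"]
Building plan at index 5 gave ["GRAPH_EXECUTION_PLAN_CREATION_FAILED"]
"""

if __name__ == "__main__":
    plans, tensors = summarize(SAMPLE)
    for index, status in sorted(plans.items()):
        print(f"plan {index}: {status}")
    for uid, (dims, strides, is_packed) in sorted(tensors.items()):
        print(f"tensor {uid}: dims={dims} strides={strides} packed={is_packed}")
```

On this excerpt, all six engine configurations report the same plan-creation failure, and all three tensors (Ids 2, 3, and 4) have packed row-major strides. To run it on a full log, pass the captured text of the `./build/bin/samples MatMul` run to `summarize` instead of `SAMPLE`.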