How to profile cuDLA computation

Open angry-crab opened this issue 11 months ago • 3 comments

Hi, I tried to profile DLA by following this tutorial: https://github.com/NVIDIA-AI-IOT/jetson_dla_tutorial

But I got Error[1]: [runtime.cpp::parsePlan::314] Error Code 1: Serialization (Serialization assertion plan->header.magicTag == rt::kPLAN_MAGIC_TAG failed.)

It seems that TensorRT cannot deserialize the loadable. Some posts say this is caused by a TensorRT version mismatch, but I used the same TensorRT version for building and for inference.

Therefore, I was wondering if there is a way to profile cuDLA. Thanks.

angry-crab avatar Mar 21 '24 04:03 angry-crab

Hi @angry-crab, TensorRT can only build the loadable; it cannot load it. The loadable has to be loaded and executed through the cuDLA API. cuDLA samples can be found at https://github.com/NVIDIA/cuda-samples/tree/master/Samples/4_CUDA_Libraries/cuDLAHybridMode and https://github.com/NVIDIA/cuda-samples/tree/master/Samples/4_CUDA_Libraries/cuDLAStandaloneMode
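For illustration, here is a minimal sketch of that load path: read the loadable from disk and pass it to cuDLA rather than to the TensorRT runtime. The helper names `readLoadable` and `loadOnDla` are hypothetical, DLA core 0 and standalone mode are assumptions, and the cuDLA calls are only compiled when `cudla.h` is found on the include path. The linked samples remain the reference for a complete load-and-execute flow.

```cpp
// Sketch only: helper names, DLA core 0, and standalone mode are assumptions.
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

#if defined(__has_include)
#  if __has_include(<cudla.h>)
#    include <cudla.h>
#    define HAVE_CUDLA 1
#  endif
#endif

// Read the whole loadable file into memory; returns an empty vector on failure.
std::vector<uint8_t> readLoadable(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return {};
    return std::vector<uint8_t>(std::istreambuf_iterator<char>(in),
                                std::istreambuf_iterator<char>());
}

// Try to load the loadable on DLA core 0, then release it again.
// Returns true only if cuDLA accepted the module.
bool loadOnDla(const std::vector<uint8_t>& loadable) {
#ifdef HAVE_CUDLA
    if (loadable.empty()) return false;

    cudlaDevHandle dev = nullptr;
    // CUDLA_STANDALONE avoids a CUDA context; hybrid mode uses CUDLA_CUDA_DLA.
    if (cudlaCreateDevice(0, &dev, CUDLA_STANDALONE) != cudlaSuccess) return false;

    cudlaModule module = nullptr;
    cudlaStatus status = cudlaModuleLoadFromMemory(
        dev, loadable.data(), loadable.size(), &module, 0);
    if (status == cudlaSuccess) cudlaModuleUnload(module, 0);

    cudlaDestroyDevice(dev);
    return status == cudlaSuccess;
#else
    (void)loadable;
    return false;  // cuDLA headers are not available on this machine
#endif
}
```

When cuDLA is available, the build also needs to link against the cuDLA library (typically `-lcudla`). A call such as `loadOnDla(readLoadable("model.loadable"))` would then exercise the load step; the file name is a placeholder.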

lynettez avatar Mar 25 '24 03:03 lynettez

Hi, thank you for the info. However, I would like to profile cuDLA's internal computations, such as matmul, conv, etc. Is there a way to do that?

angry-crab avatar Mar 28 '24 05:03 angry-crab

Sorry for the late reply. @angry-crab, here are samples that provide layer-wise statistics to the application: https://github.com/NVIDIA/Deep-Learning-Accelerator-SW/tree/main/samples/cuDLA Please check whether cudlaExternalEtbl.hpp is available on your platform. Layer-wise profiling is a new feature and may not be supported on some older platforms.
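One way to do that check at build time is sketched below. It assumes a C++17 compiler and that the header, if present, is on the include path under the name `cudlaExternalEtbl.hpp`. The macro and function names are hypothetical. Finding the header only shows that the SDK ships it, not that the platform supports layer-wise profiling at runtime.

```cpp
// Sketch only: compile-time probe for the layer-wise profiling header.
#include <string>

#if defined(__has_include)
#  if __has_include("cudlaExternalEtbl.hpp")
#    define HAVE_CUDLA_ETBL 1
#  endif
#endif
#ifndef HAVE_CUDLA_ETBL
#  define HAVE_CUDLA_ETBL 0
#endif

// True if cudlaExternalEtbl.hpp was found on the include path at compile time.
constexpr bool layerwiseHeaderAvailable() { return HAVE_CUDLA_ETBL == 1; }

// Human-readable summary of the probe result.
std::string layerwiseSupportMessage() {
    return layerwiseHeaderAvailable()
        ? "cudlaExternalEtbl.hpp found: layer-wise profiling may be available"
        : "cudlaExternalEtbl.hpp not found: layer-wise profiling is likely unsupported here";
}
```

Printing `layerwiseSupportMessage()` at startup gives a quick indication before attempting to build the layer-wise samples.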

lynettez avatar Sep 02 '24 06:09 lynettez

@lynettez https://github.com/NVIDIA/Deep-Learning-Accelerator-SW/issues/27
How can I view the DLA utilization rate?

lix19937 avatar Oct 15 '24 12:10 lix19937