ipex-llm icon indicating copy to clipboard operation
ipex-llm copied to clipboard

ONEDNN_VERBOSE does not work

Open HLneoh opened this issue 1 year ago • 3 comments

Is there a method to trace the execution of oneDNN API calls while executing inference with BigDL-LLM?

HLneoh avatar Feb 16 '24 17:02 HLneoh

oneDNN has verbose mode exactly for this purpose. Unless it's disabled at build time you can run your inference workload with export DNNL_VERBOSE=1 to get the trace.

vpirogov avatar Feb 16 '24 18:02 vpirogov

Yes, I have executed the inference using DNNL_VERBOSE=1 python inference.py but did not observe any trace of execution. I am wonder whether BigDL uses a different method to get the trace.

HLneoh avatar Feb 17 '24 03:02 HLneoh

I have executed the inference and observe the trace of execution in Intel GPU. First you should ensure you have prepared your machine following this guide I used chatglm3 to inference . When I activate conda env, I run these commands: Enable OneAPI and other settings for GPU support

source bigdl-llm-init -g

But in default, ONEDNN_VERBOSE has not been set by bigdl-llm-init, so I set it manually.

export ONEDNN_VERBOSE=all # https://oneapi-src.github.io/oneDNN/dev_guide_verbose.html#:~:text=enables%20all%20above%20flags

Then I run inference

python ./generate.py --repo-id-or-model-path /mnt/disk1/models/chatglm3-6b

The trace of execution:

2024-02-19 13:27:06,588 - root - INFO - intel_extension_for_pytorch auto imported
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:17<00:00,  2.47s/it]
2024-02-19 13:27:23,943 - bigdl.llm.transformers.utils - INFO - Converting the current model to sym_int4 format......
onednn_verbose,info,oneDNN v3.3.0 (commit 887fb044ccd6308ed1780a3863c2c6f5772c94b3)
onednn_verbose,info,cpu,runtime:threadpool,nthr:16
onednn_verbose,info,cpu,isa:Intel AVX2 with Intel DL Boost
onednn_verbose,info,gpu,runtime:DPC++
onednn_verbose,info,gpu,engine,0,backend:Level Zero,name:Intel(R) Arc(TM) A770 Graphics,driver_version:1.3.27191,binary_kernels:enabled
onednn_verbose,info,graph,backend,0:dnnl_backend
onednn_verbose,info,experimental features are enabled
onednn_verbose,info,use batch_normalization stats one pass is enabled
onednn_verbose,primitive,info,template:operation,engine,primitive,implementation,prop_kind,memory_descriptors,attributes,auxiliary,problem_desc,exec_time
onednn_verbose,graph,info,template:operation,engine,partition_id,partition_kind,op_names,data_formats,logical_tensors,fpmath_mode,backend,exec_time
onednn_verbose,info,gpu,gemm,consider:32x32,4x8x1,score:22411504.572694
onednn_verbose,info,gpu,gemm,consider:32x32,4x8x1,score:22411504.572694
onednn_verbose,info,gpu,gemm,consider:16x16,8x4x1,score:9911940.480222
onednn_verbose,info,gpu,gemm,consider:32x8,2x8x2,score:6450092.134270
onednn_verbose,info,gpu,gemm,consider:32x16,2x4x8,score:8817306.292282
onednn_verbose,info,gpu,gemm,consider:16x8,2x4x4,score:4109264.110664
onednn_verbose,info,gpu,gemm,consider:16x4,2x8x4,score:3994569.622959
onednn_verbose,info,gpu,gemm,consider:16x8,4x8x1,score:6587470.454232
onednn_verbose,info,gpu,gemm,consider:16x4,2x8x1,score:5483570.658622
onednn_verbose,info,gpu,gemm,consider:16x1,4x1x1,score:9841449.488000
onednn_verbose,info,gpu,gemm,consider:8x1,4x1x4,score:14258794.173626
onednn_verbose,info,gpu,gemm,consider:8x1,2x1x16,score:31887704.135040
onednn_verbose,info,gpu,gemm,consider:32x32,4x8x1,score:22411504.572694
onednn_verbose,info,gpu,gemm,consider:32x32,4x8x1,score:22411504.572694
onednn_verbose,info,gpu,gemm,consider:16x16,8x4x1,score:9911940.480222
onednn_verbose,info,gpu,gemm,consider:32x8,2x8x2,score:6450092.134270
onednn_verbose,info,gpu,gemm,consider:32x16,2x4x8,score:8817306.292282
onednn_verbose,info,gpu,gemm,consider:16x8,2x4x4,score:4109264.110664
onednn_verbose,info,gpu,gemm,consider:16x4,2x8x4,score:3994569.622959
onednn_verbose,info,gpu,gemm,consider:16x8,4x8x1,score:6587470.454232
onednn_verbose,info,gpu,gemm,consider:16x4,2x8x1,score:5483570.658622
onednn_verbose,info,gpu,gemm,consider:16x1,4x1x1,score:9841449.488000
onednn_verbose,info,gpu,gemm,consider:8x1,4x1x4,score:14258794.173626
onednn_verbose,info,gpu,gemm,consider:8x1,2x1x16,score:31887704.135040
onednn_verbose,info,gpu,E gemm SSS TNN 16 4 aB4x2 aB4x2 aB wg 2x8x4 kr kc4 cab4 ks8 nse di bo bk0 sm sn sr dm
onednn_verbose,primitive,create_nested:cache_miss,gpu,gemm,jit:gemm:any,undef,src_a_f32::blocked:bac::f0 src_b_f32::blocked:acb::f0 dst_f32::blocked:abc::f0,,,32x18x128:32x128x18,18.0811
onednn_verbose,primitive,create:cache_miss,gpu,matmul,jit:gemm:any,undef,src_f32::blocked:bac::f0 wei_f32::blocked:acb::f0 dst_f32::blocked:abc::f0,attr-scratchpad:user ,,32x18x128:32x128x18,21.8518
onednn_verbose,primitive,exec:check,primitive,unused primitive execution argument (80),src/common/primitive_exec_types.cpp:74
onednn_verbose,primitive,exec,gpu,matmul,jit:gemm:any,undef,src_f32::blocked:bac::f0 wei_f32::blocked:acb::f0 dst_f32::blocked:abc::f0,attr-scratchpad:user ,,32x18x128:32x128x18,0.292969
onednn_verbose,info,gpu,gemm,consider:16x16,8x4x1,score:3680416.698880
onednn_verbose,info,gpu,gemm,consider:32x8,4x4x2,score:3700671.028160
onednn_verbose,info,gpu,gemm,consider:16x8,4x4x4,score:4982393.981202
onednn_verbose,info,gpu,gemm,consider:16x8,2x4x4,score:4863343.393024
onednn_verbose,info,gpu,gemm,consider:16x4,2x8x4,score:7501812.513331
onednn_verbose,info,gpu,gemm,consider:16x8,8x4x1,score:3617596.905280
onednn_verbose,info,gpu,gemm,consider:16x8,4x8x1,score:3996614.456960
onednn_verbose,info,gpu,gemm,consider:16x8,4x4x1,score:3666412.015360
onednn_verbose,info,gpu,gemm,consider:16x4,2x8x1,score:3797270.341120
onednn_verbose,info,gpu,gemm,consider:32x1,4x1x1,score:3854629.275264
onednn_verbose,info,gpu,gemm,consider:16x1,4x1x8,score:21370218.623360
onednn_verbose,info,gpu,gemm,consider:8x1,2x1x16,score:89634940.039680
onednn_verbose,info,gpu,gemm,consider:16x16,8x4x1,score:3680416.698880
onednn_verbose,info,gpu,gemm,consider:32x8,4x4x2,score:3700671.028160
onednn_verbose,info,gpu,gemm,consider:16x8,4x4x4,score:4982393.981202
onednn_verbose,info,gpu,gemm,consider:16x8,2x4x4,score:4863343.393024
onednn_verbose,info,gpu,gemm,consider:16x4,2x8x4,score:7501812.513331
onednn_verbose,info,gpu,gemm,consider:16x8,8x4x1,score:3617596.905280
onednn_verbose,info,gpu,gemm,consider:16x8,4x8x1,score:3996614.456960
onednn_verbose,info,gpu,gemm,consider:16x8,4x4x1,score:3666412.015360
onednn_verbose,info,gpu,gemm,consider:16x4,2x8x1,score:3797270.341120
onednn_verbose,info,gpu,gemm,consider:32x1,4x1x1,score:3854629.275264
onednn_verbose,info,gpu,gemm,consider:16x1,4x1x8,score:21370218.623360
onednn_verbose,info,gpu,gemm,consider:8x1,2x1x16,score:89634940.039680
...

Zephyr596 avatar Feb 19 '24 05:02 Zephyr596