
batchnorm requires consistent in- and output mem format_tags

Open IngmarVoigt2 opened this issue 1 year ago • 4 comments

Summary

Creating a dnnl::batch_normalization_forward::primitive_desc fails with dnnl_unimplemented when the source and destination memory descriptors use inconsistent format tags, e.g. when the source descriptor comes from a convolution created with format_tag::any.

Version

3.3.0

Environment

VS2019

Steps to reproduce

Set up descriptors for a convolution followed by batch normalization.

Observed behavior

Instantiating dnnl::batch_normalization_forward::primitive_desc throws dnnl_unimplemented from https://github.com/oneapi-src/oneDNN/blob/25596d25116d3fd523f1ac5e32e44cb5e8295a9e/src/common/primitive_desc_iface.cpp#L77

This is likely due to the output memory descriptor of the convolution using format_tag::any, per https://oneapi-src.github.io/oneDNN/group_dnnl_api_convolution.html#doxid-group-dnnl-api-convolution:

> Memory descriptors can be initialized with dnnl_format_tag_any or with format_kind set to dnnl_format_kind_any

Expected behavior

  • Ideally, batch normalization would work with arbitrary combinations of input and output formats.
  • Otherwise, a more verbose exception message would be helpful.
  • At the very least, the documentation should point out this limitation, since the issue can be worked around by converting the input memory descriptor to enforce consistent format tags.
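To make the failure mode concrete, here is a minimal sketch of the setup described above (the shapes are taken from the problem descriptor in the verbose logs further down; everything else is illustrative, not the reporter's actual code). The convolution is created with format_tag::any, so conv_pd.dst_desc() may come back in a blocked layout such as acdb, while the batchnorm destination is pinned to a plain layout:

```cpp
// Sketch only: reproduces the src/dst format-tag mismatch described above.
#include "dnnl.hpp"
using namespace dnnl;

int main() {
    engine eng(engine::kind::cpu, 0);

    // Shapes loosely based on the verbose log:
    // mb3_ic4oc32_ih16oh16kh3sh1dh0ph1_iw32ow32kw3sw1dw0pw1
    memory::dims src_dims = {3, 4, 16, 32};
    memory::dims wei_dims = {32, 4, 3, 3};
    memory::dims dst_dims = {3, 32, 16, 32};

    auto any = memory::format_tag::any;
    memory::desc conv_src_md(src_dims, memory::data_type::f32, any);
    memory::desc conv_wei_md(wei_dims, memory::data_type::f32, any);
    memory::desc conv_dst_md(dst_dims, memory::data_type::f32, any);

    convolution_forward::primitive_desc conv_pd(
            eng, prop_kind::forward_inference, algorithm::convolution_direct,
            conv_src_md, conv_wei_md, conv_dst_md,
            /*strides=*/{1, 1}, /*padding_l=*/{1, 1}, /*padding_r=*/{1, 1});

    // conv_pd.dst_desc() is whatever blocked layout the implementation chose;
    // pinning the batchnorm dst to plain nchw creates the inconsistency.
    memory::desc bn_dst_md(dst_dims, memory::data_type::f32,
                           memory::format_tag::nchw);

    // Per this report, this throws dnnl_unimplemented when the src and dst
    // layouts are inconsistent:
    batch_normalization_forward::primitive_desc bn_pd(
            eng, prop_kind::forward_inference,
            conv_pd.dst_desc(), bn_dst_md, /*epsilon=*/1e-5f,
            normalization_flags::use_scale | normalization_flags::use_shift);
    return 0;
}
```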

IngmarVoigt2 avatar Jun 04 '24 07:06 IngmarVoigt2

Hi @IngmarVoigt2, have you tried running with ONEDNN_VERBOSE=all? If so, could you please share the output?

Additional information such as a code snippet of your implementation would be helpful.

Please also refer to the implementation limitations if you haven't already: https://oneapi-src.github.io/oneDNN/dev_guide_batch_normalization.html#implementation-limitations

yehudaorel avatar Jun 04 '24 16:06 yehudaorel

Thanks for the quick follow-up, @yehudaorel! Sorry, I didn't see your message back then.

The verbose logs are:

onednn_verbose,info,oneDNN v3.3.0 (commit N/A)
onednn_verbose,info,cpu,runtime:OpenMP,nthr:12
onednn_verbose,info,cpu,isa:Intel AVX2
onednn_verbose,info,gpu,runtime:none
onednn_verbose,info,graph,backend,0:dnnl_backend
onednn_verbose,primitive,info,template:operation,engine,primitive,implementation,prop_kind,memory_descriptors,attributes,auxiliary,problem_desc,exec_time
onednn_verbose,graph,info,template:operation,engine,partition_id,partition_kind,op_names,data_formats,logical_tensors,fpmath_mode,backend,exec_time
onednn_verbose,primitive,create:cache_miss,cpu,convolution,brgconv:avx2,forward_inference,src_f32:a:blocked:acdb::f0 wei_f32:a:blocked:Acdb16a::f0 bia_f32:a:blocked:a::f0 dst_f32:a:blocked:acdb::f0,,alg:convolution_direct,mb3_ic4oc32_ih16oh16kh3sh1dh0ph1_iw32ow32kw3sw1dw0pw1,5.819
onednn_verbose,primitive,create:cache_miss,cpu,eltwise,jit:avx2,forward_inference,data_f32::blocked:acdb::f0 diff_undef::undef:::,,alg:eltwise_relu alpha:0 beta:0,3x32x16x32,2.9757

Sharing this code is a bit tricky, since the different parts are integrated into a larger framework, but basically

m_mkldnn_prim_desc = std::shared_ptr<dnnl::batch_normalization_forward::primitive_desc>(
        new dnnl::batch_normalization_forward::primitive_desc(
                *m_mkldnn_engine, dnnl::prop_kind::forward_inference,
                src_d, out_d, m_epsilon, flags));

is where it ultimately fails. src_d is the output descriptor from a convolution layer (followed by an activation, as you may be able to tell from the verbose logs above).

Originally I was able to work around this by enforcing a different memory format on the input descriptor, but that does not seem to work well for me in all situations.

Any ideas based on the logs?

IngmarVoigt2 avatar Jun 20 '24 17:06 IngmarVoigt2

Also, thanks for pointing me to the documentation, but as far as I can see, none of those limitations apply in my case. Interestingly, this is not an issue when using in-place batch normalization operations, but unfortunately I cannot enforce that consistently across all components (unless I ultimately copy data around as a workaround?).

IngmarVoigt2 avatar Jun 20 '24 17:06 IngmarVoigt2

Never mind, I actually just solved it using:

auto dst_d = dnnl::memory::desc(outShape, dnnl::memory::data_type::f32, dnnl::memory::format_tag::any);
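In context, the fix is to create the batchnorm destination descriptor with format_tag::any as well, so the library can pick a layout consistent with the source instead of failing on a pinned tag. A sketch of how this slots into the snippet further up (outShape, m_mkldnn_engine, src_d, m_epsilon, and flags follow the earlier fragments; this is an illustration, not the exact framework code):

```cpp
// Let oneDNN choose the batchnorm dst layout rather than pinning a tag;
// src_d is the (possibly blocked) output descriptor of the preceding conv.
auto dst_d = dnnl::memory::desc(
        outShape, dnnl::memory::data_type::f32,
        dnnl::memory::format_tag::any);

m_mkldnn_prim_desc =
        std::make_shared<dnnl::batch_normalization_forward::primitive_desc>(
                *m_mkldnn_engine, dnnl::prop_kind::forward_inference,
                src_d, dst_d, m_epsilon, flags);
```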

Maybe you could add this to the docs?

IngmarVoigt2 avatar Jun 20 '24 17:06 IngmarVoigt2