oneDNN
cpu: aarch64: add eltwise post ops to ACL bnorm
Description
This PR adds eltwise post ops to the Compute Library for the Arm® architecture (ACL) batch normalization primitive.
ReLU (including leaky and bounded) is fused into the bnorm operator. Other eltwise ops are handled using `acl_post_ops_t`, so the primitive supports any that `acl_eltwise_fwd_t` supports.
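To make the split concrete, here is a minimal, self-contained sketch of the semantics described above: inference-mode batch normalization for a single channel followed by one eltwise post op, with the ReLU variants marked as the ones that can be fused. All names (`EltwiseAlg`, `EltwisePostOp`, `is_fused_into_bnorm`, `bnorm_with_post_op`) are hypothetical and chosen for illustration; this is not the oneDNN or ACL code from the PR, and `Tanh` is only a stand-in for "some other eltwise op".

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only; not oneDNN or ACL identifiers.
enum class EltwiseAlg { Relu, BoundedRelu, Tanh };

struct EltwisePostOp {
    EltwiseAlg alg;
    // Relu: negative slope (0 gives plain ReLU, non-zero gives leaky ReLU).
    // BoundedRelu: upper bound. Tanh: unused.
    float alpha;
};

// Assumption mirroring the PR text: only the ReLU family is fused into the
// batchnorm operator; anything else goes through a separate eltwise step.
inline bool is_fused_into_bnorm(const EltwisePostOp &po) {
    return po.alg == EltwiseAlg::Relu || po.alg == EltwiseAlg::BoundedRelu;
}

inline float apply_eltwise(const EltwisePostOp &po, float x) {
    switch (po.alg) {
        case EltwiseAlg::Relu: return x > 0.f ? x : po.alpha * x;
        case EltwiseAlg::BoundedRelu:
            return std::min(po.alpha, std::max(0.f, x));
        case EltwiseAlg::Tanh: return std::tanh(x);
    }
    return x;
}

// Normalizes one channel with precomputed statistics (no scale or shift),
// then applies the post op to every element.
inline std::vector<float> bnorm_with_post_op(const std::vector<float> &src,
        float mean, float variance, float eps, const EltwisePostOp &po) {
    assert(variance + eps > 0.f);
    const float inv_std = 1.f / std::sqrt(variance + eps);
    std::vector<float> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i)
        dst[i] = apply_eltwise(po, (src[i] - mean) * inv_std);
    return dst;
}
```

With `mean = 0`, `variance = 1` and `eps = 0`, the normalization step is the identity, so the output of `bnorm_with_post_op` shows the effect of the post op alone, which makes the three ReLU variants easy to compare.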
Adding this functionality just means that the ACL implementation now covers more cases. In the newly supported case where ReLU is passed in as a post op (rather than a flag), batchnorm performance improvements over `nspc:bnorm` will be the same as in #1378. The other eltwise post ops that are now incidentally covered are unimplemented in `nspc:bnorm` and `ref`, so there are no performance considerations there.
This PR requires and includes f6dbe1efdd08c88924d9f0f4368abd435473ffe5, cherry-picked from #1330, so it should not be merged before #1330. I couldn't target that PR branch directly on GitHub because it is on a fork.
Checklist
General
- [X] Do all unit and benchdnn tests (`make test` and `make test_benchdnn_*`) pass locally for each commit?
- [X] Have you formatted the code using clang-format?
Performance improvements
- [X] Have you submitted performance data that demonstrates performance improvements?
The changes landed into master and were backported into rls-v2.7. Thanks for the contribution!
There were some issues with GitHub updating the PR changes once #1330 was promoted (it was still showing 2 commits instead of 1), so I rebased `jondea:add-post-ops-to-acl-bnorm` on top of master.
Great, thank you! Not sure why that happened, but thanks for sorting it 😃