Cao E

Results 15 issues of Cao E

* Fixes #78611: reshaping tensors which are channels_last produces unexpected strides. * Fixes an empty-input convolution issue: when the input is empty, e.g. shape (0, 3, 3, 4)...

triaged
open source
cla signed
intel priority
intel
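
To illustrate why a reshape of a channels_last tensor can end up with surprising strides, here is a minimal sketch in plain Python that computes the element strides of a 4-D (N, C, H, W) shape under the two layouts. The helper names `contiguous_strides` and `channels_last_strides` are hypothetical and are not taken from the PR; the sketch only shows the layout arithmetic, not the fix itself.

```python
def contiguous_strides(shape):
    """Row-major strides (in elements) for an arbitrary shape."""
    strides = []
    step = 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))


def channels_last_strides(shape):
    """Strides (in elements) for a 4-D (N, C, H, W) shape stored as NHWC."""
    n, c, h, w = shape
    return (h * w * c, 1, w * c, c)


shape = (2, 3, 4, 5)
print(contiguous_strides(shape))     # (60, 20, 5, 1)
print(channels_last_strides(shape))  # (60, 1, 15, 3)
```

The channel dimension has stride 1 in the channels_last layout, so an operation that assumes row-major strides for the same logical shape will see a different memory order.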

Add specialized AVX512 transposes for Mx2, Mx4, 2xN, and 4xN shapes to improve transpose performance for those shapes. * When the shape is Mx2 or...

cla signed

Add inference, AMP, and channels_last support for began.

Make q, k, and v contiguous to get better performance for normalize. After the rearrange operations for q, k, and v, normalizations on the last dim for q and k...

Use 32x32 block size for i16 transpose.

cla signed
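
The 32x32 block size mentioned above refers to tiling a transpose so that each tile is processed as a unit. The entry does not describe the kernel, so the following is only a sketch of the general blocking idea in plain Python, not the PR's implementation; `blocked_transpose` and its parameters are hypothetical names.

```python
def blocked_transpose(src, rows, cols, block=32):
    """Transpose a row-major rows x cols matrix held in a flat list.

    The matrix is walked tile by tile (block x block), so reads and
    writes stay within a small region before moving to the next tile.
    """
    dst = [0] * (rows * cols)
    for i0 in range(0, rows, block):
        for j0 in range(0, cols, block):
            # Clamp the tile to the matrix edge for partial tiles.
            for i in range(i0, min(i0 + block, rows)):
                for j in range(j0, min(j0 + block, cols)):
                    dst[j * rows + i] = src[i * cols + j]
    return dst


print(blocked_transpose(list(range(6)), rows=2, cols=3))  # [0, 3, 1, 4, 2, 5]
```

A vectorized kernel would replace the two inner loops with register-level shuffles on each tile; the outer tiling structure is the part this sketch is meant to show.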

Fixes #ISSUE_NUMBER. This PR depends on https://github.com/intel/ideep/pull/165

module: cpu
module: mkldnn
open source
topic: not user facing

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #123514 Part of https://github.com/pytorch/pytorch/issues/123224. Set simdlen based on the environment variable ATEN_CPU_CAPABILITY to control the CPU vec ISA, as in eager mode. cc @voznesenskym...

open source
ciflow/trunk
ciflow/periodic
module: inductor
ciflow/inductor
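
As a rough illustration of the idea in #123514, the sketch below picks a vector width from the `ATEN_CPU_CAPABILITY` environment variable and caps it at what the machine supports. Everything except the variable name is an assumption for illustration: the function `simdlen_from_env`, the `detected` parameter, and the capability-to-bits table are hypothetical and do not reflect Inductor's actual code.

```python
import os

# Assumed mapping from capability name to vector width in bits.
# "default" is treated here as "no vectorization" (0 bits).
_CAPABILITY_BITS = {"default": 0, "avx2": 256, "avx512": 512}


def simdlen_from_env(env=None, detected="avx512"):
    """Return the vector width in bits to use.

    env:      mapping to read ATEN_CPU_CAPABILITY from (defaults to os.environ).
    detected: the widest capability the machine supports (hypothetical input).
    """
    env = os.environ if env is None else env
    detected_bits = _CAPABILITY_BITS[detected]
    requested = env.get("ATEN_CPU_CAPABILITY")
    if requested is None:
        return detected_bits
    bits = _CAPABILITY_BITS.get(requested.strip().lower())
    if bits is None:
        # Unknown value: fall back to what was detected.
        return detected_bits
    # Never exceed what the hardware supports.
    return min(bits, detected_bits)


print(simdlen_from_env({"ATEN_CPU_CAPABILITY": "avx2"}))  # 256
```

Passing the environment as a parameter keeps the sketch easy to test without mutating the process environment.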

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #133194 * #131879 * __->__ #131878 cc @gujinghui @PenghuiCheng @XiaobingSuper @jianyuh @jgong5 @mingfeima @sanchitintel @ashokei @jingxu10 @min-jean-cho @yanbing-j @Guobing-Chen @Xia-Weiwen @snadampal

module: mkldnn
open source
ciflow/trunk
ciflow/inductor
ciflow/linux-aarch64

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #133194 * #131879 * #131878 cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225...

module: cpu
open source
module: inductor
module: dynamo
ciflow/inductor

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #133194 * __->__ #131879 * #131878 cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10

module: cpu
open source
module: half
ciflow/trunk
ciflow/inductor