Cao E
* Fixes #78611: reshaping tensors that are channels_last produces unexpected strides. * Fixes an empty-input convolution issue: when the input is empty, e.g. with shape (0, 3, 3, 4)...
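For context on what "stride" means for a channels_last tensor, here is a minimal pure-Python sketch of the two dense layouts of a 4-D tensor. It is not the code from the PR; the helper names `contiguous_strides` and `channels_last_strides` are hypothetical, and the sketch assumes all dimensions are non-zero, so it does not cover the empty-input case.

```python
def contiguous_strides(shape):
    """Row-major (NCHW) strides, in elements, for a dense tensor of this shape."""
    strides = []
    running = 1
    # Walk from the innermost dimension outward, accumulating the element count.
    for size in reversed(shape):
        strides.append(running)
        running *= size
    return tuple(reversed(strides))


def channels_last_strides(shape):
    """Strides of a tensor with logical shape (N, C, H, W) stored physically as NHWC."""
    n, c, h, w = shape
    # C is the fastest-moving dimension, then W, then H, then N.
    return (h * w * c, 1, w * c, c)


shape = (2, 3, 3, 4)
print(contiguous_strides(shape))     # (36, 12, 4, 1)
print(channels_last_strides(shape))  # (36, 1, 12, 3)
```

Both layouts hold the same 72 elements; only the order in memory differs, which is why an operation that changes the shape has to choose which stride pattern the result gets.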
Add AVX512-specific transposes for Mx2, Mx4, 2xN, and 4xN shapes to improve transpose performance for those shapes. * When the shape is Mx2 or...
Add inference, AMP, and channels_last support for began.
Make q, k, and v contiguous to get better normalization performance. After the rearrange operations for q, k, and v, normalizations on the last dim for q and k...
Fixes #ISSUE_NUMBER. This PR depends on https://github.com/intel/ideep/pull/165
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #123514 Part of https://github.com/pytorch/pytorch/issues/123224. Set simdlen based on the ATEN_CPU_CAPABILITY environment variable so that the CPU vector ISA is controlled the same way as in eager mode. cc @voznesenskym...
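The idea of capping the vector width from an environment variable can be sketched as follows. This is an illustration only, not the code from the PR: the function `simdlen_from_env`, the `_CAPABILITY_BITS` table, and the rule of taking the minimum of the requested and detected widths are all assumptions, as is the set of accepted values (`default`, `avx2`, `avx512`).

```python
import os

# Hypothetical mapping from capability name to vector width in bits.
# 0 stands for "no explicit vectorization" in this sketch.
_CAPABILITY_BITS = {"default": 0, "avx2": 256, "avx512": 512}


def simdlen_from_env(detected_bits, environ=None):
    """Return the vector width to use, capped by ATEN_CPU_CAPABILITY if it is set."""
    environ = os.environ if environ is None else environ
    requested = environ.get("ATEN_CPU_CAPABILITY")
    if requested is None:
        # Variable not set: use whatever the hardware supports.
        return detected_bits
    cap = _CAPABILITY_BITS.get(requested.strip().lower())
    if cap is None:
        # Unknown value: ignore it in this sketch.
        return detected_bits
    # Never exceed what the hardware was detected to support.
    return min(cap, detected_bits)


print(simdlen_from_env(512, {"ATEN_CPU_CAPABILITY": "avx2"}))  # 256
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.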
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #133194 * #131879 * __->__ #131878 cc @gujinghui @PenghuiCheng @XiaobingSuper @jianyuh @jgong5 @mingfeima @sanchitintel @ashokei @jingxu10 @min-jean-cho @yanbing-j @Guobing-Chen @Xia-Weiwen @snadampal
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #133194 * #131879 * #131878 cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225...
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #133194 * __->__ #131879 * #131878 cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10