Xu Han

Results 16 issues of Xu Han

in ObjDump: 0000000007f01e90 : 7f01e90: f3 0f 1e fa endbr64

Original build options from the Makefile: > gcc -DNDEBUG -DLIBXSMM_NOFORTRAN -DLIBXSMM_TARGET_ARCH=1006 -DLIBXSMM_OPENMP_SIMD -DLIBXSMM_BUILD=2 -Iinclude -I./src -msse4.2 -fPIC -Wall -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -pthread -Werror -c ./src/generator_mateltwise.c -o obj/intel64/generator_mateltwise.o...

1. Add "SLEEF_BUILD_SHARED_LIBS" to replace the CMake reserved variable "BUILD_SHARED_LIBS". 2. Pass an explicit library type to add_library, which removes the global control exerted by "BUILD_SHARED_LIBS". Link: https://cmake.org/cmake/help/latest/guide/tutorial/Selecting%20Static%20or%20Shared%20Libraries.html Additionally, I write...

Fix "SLEEF_BUILD_SCALAR_LIB", which had lost its ability to control whether the sleefscalar lib is built.

|Status|Control Build|Control Install|
|----|----|----|
|Current Code|No|Yes|
|This PR|Yes|Yes|

Warning message on MSVC:

```cmd
28>D:\xu_github\sleef\src\libm\sleefsimddp.c(28,9): warning C4068: unknown pragma 'STDC'
28>sleefsimdsp.c
28>D:\xu_github\sleef\src\libm\sleefsimdsp.c(28,9): warning C4068: unknown pragma 'STDC'
28>Generating Code...
```

Fix it by conditionally defining the pragma that disables FP contraction.

Fixes #119304 1. Add a try/catch around the compiler version check. 2. Retry the compiler version query. 3. Return False if the compiler info cannot be obtained after two attempts. cc @jgong5...
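The three steps listed for #119304 can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function name `query_compiler_version`, the `attempts` and `runner` parameters, and the use of `--version` are all assumptions made for the example.

```python
import subprocess


def query_compiler_version(compiler, attempts=2, runner=subprocess.check_output):
    """Return the compiler's version banner, or False if every attempt fails.

    Hypothetical helper: `runner` is injectable only so the retry logic can be
    exercised without a real compiler installed.
    """
    for _ in range(attempts):
        try:
            # Step 1: guard the version query with try/except.
            output = runner([compiler, "--version"], stderr=subprocess.STDOUT)
            return output.decode("utf-8", errors="replace")
        except (subprocess.SubprocessError, OSError):
            # Step 2: fall through and retry the query.
            continue
    # Step 3: give up after the final failed attempt.
    return False
```

Calling `query_compiler_version("gcc")` would return the banner text when the compiler responds, and `False` when both attempts raise.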

module: cpu
open source
ciflow/trunk
topic: bug fixes
intel
release notes: inductor
merging

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #116178 POC link: https://github.com/xuhancn/x86_isa_help Change logs: 1. Use the new CppBuilder, which also supports Windows MSVC. 2. Add a cpuid-based x86 ISA detector,...

open source
Stale
intel
module: inductor
ciflow/inductor

The previous full PR https://github.com/pytorch/pytorch/pull/115248 failed to merge because fb_code is hard to debug. I also tried to submit it as two pieces, https://github.com/pytorch/pytorch/pull/118514 and https://github.com/pytorch/pytorch/pull/118515 , and they have passed...

module: cpu
triaged
open source
intel priority
ciflow/trunk
module: inductor
module: dynamo
ciflow/inductor

# Summary

During our PyTorch development, we found that the Windows system memory allocator performs poorly and slows down PyTorch as a whole. After adding a third-party memory allocator, PyTorch improved...

enhancement

Fixes #ISSUE_NUMBER cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10

module: cpu
open source
Stale