Bin Bao
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #130977 * #134639 cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #130977 * __->__ #134639 Summary: benchmarks/dynamo/ci_expected_accuracy/update_expected.py expects a benchmark run config to be named {config}_{benchmark}, and CPU tests should follow the same naming...
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #138250 * #138379 * __->__ #138303 Summary: The problem happened after splitting CppWrapperCpu and CppWrapperCpuArrayRef, because CppWrapperCpuArrayRef.generate_index_put_fallback missed a statement. Running test_aot_inductor.py as...
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #138250 * #138379 * #138303 Summary: Move use_minimal_arrayref_interface-specific logic from CppWrapperCpu to CppWrapperCpuArrayRef. This is a copy-on-write style refactor, to simply...
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #138250 * __->__ #138379 * #138303 Summary: Add missing use_minimal_arrayref_interface setting to check_model_with_multiple_inputs. cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv...
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #141041 Summary: Fixes https://github.com/pytorch/pytorch/issues/140766. In AOTI's two-pass codegen, the first pass generates triton_per_fused_add_native_layer_norm_4, and the second pass generates triton_red_fused_add_native_layer_norm_4. While this problem...