Rohanjames1997 issues

Results 6 issues of Rohanjames1997

Add UT for NEON implementation of vec_reduce_all

All these changes are required to add a unit test (UT) for the NEON implementation of `vec_reduce_all` that I introduced in #105590. This enables reuse of the existing tests in `aten/src/ATen/test/vec_test_all_types.cpp`. As...

module: cpu

triaged

open source

ciflow/trunk

release notes: sparse

[Inductor] Add support for NEON ISA in the Inductor C++ backend

Fixes #104729. As suggested in the [blog](https://dev-discuss.pytorch.org/t/torchinductor-update-5-cpu-backend-backend-performance-update-and-deep-dive-on-key-optimizations/1117#:~:text=It%20can%20be,sub%2Dclasses.), I subclassed the `VecISA` class and implemented a NEON version of the `vec_reduce_all()` function, to go along with the existing AVX2 and AVX512...

module: cpu

triaged

open source

module: inductor

ciflow/inductor

release notes: inductor

Add RoBERTa to the model suite

https://huggingface.co/FacebookAI/xlm-roberta-base

cla signed

aarch64: Add build configs for ACL and onednn+ACL

Tested along with https://github.com/openxla/xla/pull/16527

ENH: Speed up umath functions using NEON/SVE | SIMD

### Proposed new feature or change: Similar to how https://github.com/numpy/numpy/pull/21955 vectorized umath functions using AVX512 FP16, I'm interested in leveraging NEON/SVE to get similar benefits for aarch64 processors. I'd be...

Run CI on GitHub-hosted arm64 runners too

This PR enables CI on GitHub-hosted arm64 runners, which are now [available for free](https://github.blog/changelog/2025-01-16-linux-arm64-hosted-runners-now-available-for-free-in-public-repositories-public-preview/) in public repositories. Related to #11275.

devops