
Add option to change SVE vector length for current and children processes

Open SwapnilGaikwad opened this issue 1 year ago • 7 comments

Currently, coreclr assumes an SVE vector length of 128 bits, which limits the size of Vector<T> to 16 bytes. Executing on a platform that offers a larger vector length, such as 256 bits, means the instructions operate on the larger registers. This leads to incorrect results, e.g., when using unzip even (the uzp1 instruction). Here C# expects processing based on half of each vector (128 bits), while the actual result is based on the full vectors (256 bits).

Add a DOTNET_MaxVectorLength=N flag, where N is the desired SVE vector length in bytes for the current execution. Let M be the max/current vector length and N the vector length specified with the DOTNET_MaxVectorLength option. Then V, the new vector length for the current execution, is chosen as follows:

  • If N < M, N % 16 == 0 (a valid vector length), V = N
  • If N < M, N % 16 != 0 (an invalid vector length), V = M
  • If N > M (N can be a valid or invalid length), V = M
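The three rules above can be sketched as a small C function. This is only an illustration, not the code in the PR: the name `choose_vector_length` and its parameters are hypothetical, and the handling of N == M and N == 0 (neither is covered by the rules) is an assumption that falls back to M.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not the coreclr implementation.
 *   requested: N, the value given via DOTNET_MaxVectorLength, in bytes
 *   os_max:    M, the max/current SVE vector length, in bytes
 * Returns V, the vector length to use for the current execution.
 */
static uint32_t choose_vector_length(uint32_t requested, uint32_t os_max)
{
    /* Rule 1: N < M and N is a multiple of 16 bytes, so V = N.
     * Assumption: N == 0 is treated as invalid, although 0 % 16 == 0. */
    if (requested != 0 && requested < os_max && requested % 16 == 0) {
        return requested;
    }

    /* Rule 2 (N < M, invalid length) and rule 3 (N > M): V = M.
     * Assumption: N == M also lands here, which yields the same value. */
    return os_max;
}
```

For example, with M = 32 (a 256-bit machine), N = 16 selects 16 bytes, while N = 24 or N = 64 both leave the length at 32 bytes.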

SwapnilGaikwad avatar Apr 19 '24 14:04 SwapnilGaikwad

Tagging subscribers to this area: @mangod9 See info in area-owners.md if you want to be subscribed.

@kunalspathak @a74nh @dotnet/arm64-contrib @arch-arm64-sve

SwapnilGaikwad avatar Apr 19 '24 14:04 SwapnilGaikwad

This patch is required for #101294 while running on a system that offers SVE vector length greater than 128.

SwapnilGaikwad avatar Apr 19 '24 14:04 SwapnilGaikwad

This PR fixes the issues I've been seeing implementing AddAcross.

We've both been using V1 machines. On an N2 this PR is not required and has no effect.

a74nh avatar Apr 19 '24 14:04 a74nh

Updated the PR to do the following.

On Linux

  • Disable SVE when the user specifies a vector length that's smaller than the available/OS vector length.
  • Avoid inline assembly and use ACLE (svcntb()) to retrieve the vector length.
    • Use of svcntb() is available from clang-16. On older versions of clang, such as clang-14, including arm_sve.h errors out with SVE support not enabled on non-SVE systems.

On Windows

  • Return a hardcoded vector length of 16 bytes until we find a suitable mechanism to retrieve it.
  • Added a TODO to note the current limitation.
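A minimal sketch of the two paths described above, assuming a compile-time guard on the ACLE macro `__ARM_FEATURE_SVE`. The function name `get_sve_length_bytes` and the constant `FALLBACK_VECTOR_LENGTH_BYTES` are hypothetical, and the actual PR may detect SVE differently (for example at run time).

```c
#include <stdint.h>

/* Only pull in the ACLE header when the compiler has SVE enabled;
 * including it unconditionally is what fails on older clang versions. */
#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>
#endif

/* Hypothetical constant: the 16-byte (128-bit) value used where the
 * length cannot be queried, e.g. the Windows path described above. */
#define FALLBACK_VECTOR_LENGTH_BYTES 16u

/* Hypothetical helper: returns the SVE vector length in bytes. */
static uint64_t get_sve_length_bytes(void)
{
#if defined(__ARM_FEATURE_SVE)
    /* svcntb() is the ACLE intrinsic that returns the number of
     * 8-bit elements in an SVE vector, i.e. its length in bytes. */
    return svcntb();
#else
    /* TODO: replace once a suitable mechanism exists on this platform. */
    return FALLBACK_VECTOR_LENGTH_BYTES;
#endif
}
```

Note that the guard is decided when the code is compiled, so a binary built without SVE enabled always reports 16 bytes, whatever the hardware supports.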

SwapnilGaikwad avatar May 03 '24 16:05 SwapnilGaikwad

  • Use of svcntb() is available from clang-16. On older versions of clang, such as clang-14, including arm_sve.h errors out with SVE support not enabled on non-SVE systems.

This is because the __attribute__ sve is only supported from clang-16.

My only concern here is that this is pushing the minimum supported version of clang to clang-16. I'm not sure what the constraints for building coreclr are.

It would be the same if replaced with inline asm.

An alternative would be to write the asm in hex as suggested above.
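To illustrate what "writing the asm in hex" would involve, here is a sketch that computes the 32-bit encoding of `rdvl`. The base opcode 0x04BF5000, with a 6-bit signed immediate in bits 10:5 and the destination register in bits 4:0, is the layout I understand the instruction to have; the helper name `encode_rdvl` is hypothetical, and this is not code from the PR.

```c
#include <stdint.h>

/*
 * Hypothetical helper: builds the instruction word for
 *   rdvl x<rd>, #<imm>
 *   rd:  destination general-purpose register number (0..30)
 *   imm: multiplier of the vector length in bytes (-32..31)
 */
static uint32_t encode_rdvl(unsigned rd, int imm)
{
    return 0x04BF5000u                        /* fixed opcode bits    */
         | (((uint32_t)imm & 0x3Fu) << 5)     /* imm6 in bits 10:5    */
         | (rd & 0x1Fu);                      /* Rd in bits 4:0       */
}
```

Under these assumptions, `rdvl x0, #1` encodes as 0x04BF5020, which an assembler lacking the mnemonic could emit as a raw word (with GNU or LLVM assemblers, `.inst 0x04bf5020`).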

a74nh avatar May 03 '24 16:05 a74nh

Converted the PR to draft until the build issues on non-linux systems are solved.

SwapnilGaikwad avatar May 08 '24 10:05 SwapnilGaikwad

Hi @jkotas, do you have any suggestions to fix this build error - unknown opcode: rdvl, for Windows on Arm64 ? Perhaps there are existing places where we use a hex code for an instruction.

SwapnilGaikwad avatar Jun 05 '24 15:06 SwapnilGaikwad

do you have any suggestions to fix this build error - unknown opcode: rdvl, for Windows on Arm64 ?

I do not think we need to be creative: https://github.com/dotnet/runtime/pull/101295/files#r1628086182

jkotas avatar Jun 05 '24 16:06 jkotas

do you have any suggestions to fix this build error - unknown opcode: rdvl, for Windows on Arm64 ?

I do not think we need to be creative: https://github.com/dotnet/runtime/pull/101295/files#r1628086182

Currently, on windows/arm64, GetSveLengthFromOS() returns a different value each time, depending on the contents of x0. Until we upgrade the masm on the CI machines (not sure how frequently that happens), we should at least make it return 128 for windows/arm64.

kunalspathak avatar Jun 05 '24 17:06 kunalspathak