STL icon indicating copy to clipboard operation
STL copied to clipboard

Should we require SSE2?

Open StephanTLavavej opened this issue 2 years ago • 12 comments
trafficstars

We still support targeting Win7, but before it stopped receiving security updates, it was patched to require SSE2. This was KB4103718 in May 2018. It's now 5 years later. I think it would be safe for the STL to begin assuming unconditional support for SSE2, because this would cause problems only if:

  • An end user has a still-functioning machine with a truly ancient non-SSE2 processor,
  • They're running Win7 without any security updates past May 2018,
  • They receive x86 software from a user-programmer using a fresh version of VS 2022 (whether an updated program itself, or a VCRedist)

This seems like it's not worth worrying about anymore. The changes on the STL's side would be minimal, but would be an improvement for all other x86-targeting user-programmers - we would remove /arch:IA32 from the STL's build system (and test coverage), and the vectorized algorithms could begin assuming unconditional support for SSE2.

StephanTLavavej avatar Aug 04 '23 17:08 StephanTLavavej

What are gains? For separately compiled vector algorithms they are minimal

AlexGuteniev avatar Aug 04 '23 17:08 AlexGuteniev

Slightly fewer configurations to test.

StephanTLavavej avatar Aug 04 '23 17:08 StephanTLavavej

Personally I don't care much -- the project I'm working on simply requires x64.

Yet, I would see such a decision as some lack of discipline. If the compiler and the CRT supports a mode, why would STL silently break? I'd expect a consistent decision across the toolset, with another decision, like should the last version of toolset for a previous mode be canned for some extended use, like the XP toolset, or not.

AlexGuteniev avatar Aug 04 '23 17:08 AlexGuteniev

this would cause problems only if:

  • An end user has a still-functioning machine with a truly ancient non-SSE2 processor,
  • They're running Win7 without any security updates past May 2018,
  • They receive x86 software from a user-programmer using a fresh version of VS 2022 (whether an updated program itself, or a VCRedist)

My experience also suggests that the amount of users for which this may be a problem may be very small and insignificant for you to care of, but it is not likely to be zero.

Like you have million developer users with billion end users, and you expect the set for your criteria is still empty? Some users may be servicing old production machines or kiosks or something, getting updates to their IDE, possibly not even realizing they are updating the STL with that, and they will be broken by such a change.

I'm not really defending keeping a legacy mode forever, but I still think that the removal should be graceful, with blocking the legacy mode across the toolset and a sympathetic apology message.

AlexGuteniev avatar Aug 04 '23 17:08 AlexGuteniev

If the compiler and the CRT supports a mode, why would STL silently break?

There's some precedent - we dropped XP/Vista (the compiler and CRT don't care the way we do), and we also blocked /RTCc which the compiler supports but which is utterly intolerable for the STL.

We'll talk about this at the next weekly maintainer meeting; I do expect that the set is either empty or limited to malware botnets. The machines really are physically breaking down at this point, and even Windows didn't want to continue issuing security updates at the end of their Win7 lifecycle (and we've now waited an additional 5 years beyond that!).

StephanTLavavej avatar Aug 04 '23 19:08 StephanTLavavej

Just to make sure I understand this correctly: The only prerequisite is a SSE2 capable processor - nothing has to be specifically activated by the OS/Hypervisor correct?

MikeGitb avatar Aug 07 '23 05:08 MikeGitb

Correct.

StephanTLavavej avatar Aug 08 '23 00:08 StephanTLavavej

A hypervisor may allow changing CPUID. If it can hide SSE (whole SSE, not just SSE2), then the OS will not maintain the corresponring architectural state, and SSE2 code will break.

Not sure if this can actually be achived. Found information on some VMWare product, the least it allows to hide is SSE3.

AlexGuteniev avatar Aug 08 '23 05:08 AlexGuteniev

Museum software, running on museum OS's, can continue to use museum compiler toolsets. Ensure an appropriate download link is still alive for the older toolsets and move on.

Reminder also that Win8.1 consumer went out of extended support in Jan and Server2012 will go out of extended support in October. Let em die too.

rboxman avatar Aug 09 '23 04:08 rboxman

We talked about this at the weekly maintainer meeting and are in favor of requiring SSE2 unconditionally.

Note that this will require very minor internal build system changes to mirror the /arch:IA32 removal. (A 5-second search didn't find it, but I know this setting is squirreled away somewhere.)

StephanTLavavej avatar Aug 09 '23 21:08 StephanTLavavej

It has occurred to me that vector_algorithm.cpp non-SSE2 mode is used for ARM64EC. I guess the VSO_0000000_vector_algorithms should continue covering non-SSE2 until we can run ARM64RC tests here. (This doesn't prevent /arch:IA32 removal from test matrices though.)

AlexGuteniev avatar Aug 13 '23 11:08 AlexGuteniev

It appears that the internal build system is globally setting ClArchx86 to IA32 in src/tools/Microsoft.DevDiv.Native.targets. This takes effect only when <When Condition="'$(BuildArchitecture)' == 'i386'"> and is otherwise ignored. A project can set <ClArchx86>SSE2</ClArchx86>, which src/vctools/StaticAnalysis/StaticAnalysis.Common.props is already doing.

I see a couple of options for the internal build - we could modify src/vctools/crt/crt_build.settings.targets to take effect across most of the VCLibs (including VCRuntime), or we could modify src/vctools/crt/github/stl/msbuild/stl_base/libcp.settings.targets for the static lib and src/vctools/crt/github/stl/msbuild/stl_base/msvcp.settings.targets, src/vctools/crt/github/stl/msbuild/stl_1/msvcp_1.settings.targets, etc. for the DLLs, which would be targeted to the STL.

StephanTLavavej avatar Oct 20 '23 23:10 StephanTLavavej

Blocked by VSO-1952930 "x86 /O2 /arch:SSE2 silent bad codegen in vectorized std::min_element for int64_t".

This repros with VS 2022 17.9 Preview 5 x86 as soon as I remove /arch:IA32 from CMakeLists.txt.

Internally, I tried removing all ClArchx86 machinery (in src/tools/Microsoft.DevDiv.Native.targets, src/vctools/StaticAnalysis/StaticAnalysis.Common.props, and src/vctools/VCTools.DevDiv.Cpp.targets) to build the entire toolset with /arch:SSE2, but encountered ~90 test failures, some beyond std::MEOW_element. It's unclear whether these were cascaded from the damaged <algorithm> behavior, or if other codegen bugs are involved.

StephanTLavavej avatar Feb 07 '24 12:02 StephanTLavavej

I basically duplicated #3118 with somewhat different discussion. :joy_cat:

StephanTLavavej avatar Feb 07 '24 20:02 StephanTLavavej

Internally, I tried removing all ClArchx86 machinery (in src/tools/Microsoft.DevDiv.Native.targets, src/vctools/StaticAnalysis/StaticAnalysis.Common.props, and src/vctools/VCTools.DevDiv.Cpp.targets) to build the entire toolset with /arch:SSE2, but encountered ~90 test failures, some beyond std::MEOW_element. It's unclear whether these were cascaded from the damaged <algorithm> behavior, or if other codegen bugs are involved.

Can try again and see

AlexGuteniev avatar May 23 '24 05:05 AlexGuteniev

GitHub's happy now. For the MSVC-internal build, I also had to edit src/vctools/langapi/fmath/fmath.c to remove two #if ((FMATH_DEBUG) && (TARGET_X86) && (_M_IX86)) blocks implementing an assertion that was assuming /arch:IA32.

With that, the only remaining failure is affecting 5 tests (GH_002030_asan_annotate_string and GH_002030_asan_annotate_vector in std, plus 3 more in the compiler and vcr suites):

AddressSanitizer: CHECK failed: interception_win.cpp:172 "(("Interception failure, stopping early. Set ASAN_WIN_CONTINUE_ON_INTERCEPTION_FAILURE=1 to try to continue." && 0)) != (0)" (0x0, 0x0) (tid=19236)

StephanTLavavej avatar May 25 '24 17:05 StephanTLavavej

I figured out how to fix this with internal PRs:

StephanTLavavej avatar May 30 '24 17:05 StephanTLavavej