Slicer icon indicating copy to clipboard operation
Slicer copied to clipboard

CLI executable terminates with fault due to incompatible system C++ redistributable

Open jamesobutler opened this issue 1 year ago • 13 comments

Summary

I have recently built Slicer (as of https://github.com/Slicer/Slicer/commit/1eccd354e0770a198dbfecd5a372bd1d9b3ef16c) using Visual Studio 2022 17.10 with MSVC 1940. I am observing that the CLI module executables such as ResampleScalarVolume.exe (located at "%LOCALAPPDATA%\slicer.org\Slicer 5.7.0-2024-07-02\lib\Slicer-5.7\cli-modules\ResampleScalarVolume.exe") are utilizing the MSVC runtime dlls from the system instead of the ones included at "%LOCALAPPDATA%\slicer.org\Slicer 5.7.0-2024-07-02\bin". This results in ResampleScalarVolume.exe failing with the following message reported on the python console -> [VTK] Resample Scalar Volume terminated with a fault. image

If I copy the various msvcp dlls from "%LOCALAPPDATA%\slicer.org\Slicer 5.7.0-2024-07-02\bin" to be located next to the CLI module executable then things begin to work successfully. image

Steps to reproduce

Build and package 3D Slicer using MSVC 1940 and then install onto a computer with an older redistributable version. In my case I was running on a system with older 14.31.31103 when I observed ResampleScalarVolume terminate with a fault. Note that I have found it difficult to downgrade a system with latest redistributable to an older one as it seems like Windows will try to keep the newer one around. image

Then use MRHead sample data and try running ResampleScalarVolume module with it.

Environment

  • Slicer version: Slicer (as of https://github.com/Slicer/Slicer/commit/1eccd354e0770a198dbfecd5a372bd1d9b3ef16c)
  • Operating system: Windows

Possible user report of the same issue leading to CLI terminating with fault: https://discourse.slicer.org/t/model-to-model-distance-computation-is-completed-with-error/23901/3?u=jamesobutler

jamesobutler avatar Jul 09 '24 18:07 jamesobutler

@jcfr Is the app launcher supposed to do any handling of paths to make sure the CLI module executables are running against the msvcp runtime dlls in the application /bin directory?

The msvcp files in general appear to make it into the /bin directory by way of: https://github.com/Slicer/Slicer/blob/532b541f45a31fbc6ce22a86dc5549116f37cb4c/CMake/SlicerCPack.cmake#L162

jamesobutler avatar Jul 09 '24 18:07 jamesobutler

Probably the most robust solution would be to build the launcher with startic linking, including the MSVC runtime. Otherwise the DLLs would need to be in the same folder olad the exe, or we would need to use application config file or manifests to make the loader find the DLLs in the bin subfolder.

@jcfr have tou tried to build the windows launcher with static multithreading runtime? (/MT - see https://learn.microsoft.com/en-us/cpp/build/reference/md-mt-ld-use-run-time-library?view=msvc-170).

lassoan avatar Jul 10 '24 04:07 lassoan

I have issued https://github.com/Slicer/SlicerExecutionModel/pull/142 as a workaround to include system libraries next to these CLI executables.

Using that PR I observe ..lib/Slicer-5.7/cli-modules/msvcp140.dll being specified along with the existing ..bin/msvcp140.dll in the following build artifact. This results in the package including these msvcp dlls in the cli-modules install location. "C:\S5R\Slicer-build\install_manifest_RuntimeLibraries.txt"

jamesobutler avatar Jul 10 '24 14:07 jamesobutler

@jamesobutler how do you run "%LOCALAPPDATA%\slicer.org\Slicer 5.7.0-2024-07-02\lib\Slicer-5.7\cli-modules\ResampleScalarVolume.exe"? Can you provide a full command line?

The launcher sets up paths, but DLLs are ignored if there are suitable DLLs in system folders, because the DLL search order is something like:

  • folder of the executable
  • system folders
  • additional folders in PATH environment variable

What is unexpected that the DLLs that were found in system folders did not work. These things happened long time ago, when DLL hell was still a thing, but this should not occur anymore. If the loader finds that the DLLs are compatible (based on DLL name, manifest, etc.) then they should work well and should not make the application crash.

One potential root cause is that some software incorrectly installed garbage DLLs in the system folders. Applications should not install anything into the system32 folder (shared DLLs should be installed as side-by-side assemblies in the WinSxS folder), but it is hard to prevent this or detect&undo this. Still, you can do a quick test: temporarily remove the offending msvc*.dlls from the system32 folder and see if it fixes the issue.

What is this lib/Freud-5.7/cli-modules/msvcp140.dll? Is it a custom Slicer application?

lassoan avatar Jul 11 '24 18:07 lassoan

@lassoan I was running the CLI executable from within a Slicer scripted module. I was calling something like slicer.cli.run(slicer.modules.resamplescalarvolume, None, parameters, wait_for_completion=True, update_display=False). However I was also running it straight through the user interface and observing the CLI executable terminating with fault. I switched the current module to ResampleScalarVolume in the Slicer application and ran against the MRHead sample dataset.

What is this lib/Freud-5.7/cli-modules/msvcp140.dll? Is it a custom Slicer application?

Apologies, yes for a custom application name. I have updated to reflect lib/Slicer-5.7/cli-modules/msvcp140.dll

jamesobutler avatar Jul 11 '24 19:07 jamesobutler

From https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist?view=msvc-170

The Redistributable version must be at least as recent as the MSVC build toolset used to build your app.

In my case I built the ResampleScalarVolume.exe with the latest v143 toolset and MSVC 1940. However when I packaged the full Slicer application and installed onto another computer, that computer only had up to the 14.31.31103 redistributable instead of 14.40.33810.0. The compatibility for Visual Studio 2015-2022 seems to all be backwards compatibility, but breaks down if you don't have the latest version of the 2015-2022 redistributable and your app was built with a newer one. In this case the Slicer application was appropriately packaged (include(InstallRequiredSystemLibraries)) and had the various msvcp.dll installed next to the SlicerApp-real.exe in the bin directory, but the various msvcp.dll were not installed next to the executables in the CLI module directory and so it appears they are falling back to the System location instead of the one used by SlicerApp-real.exe which we know for sure will be compatible.

jamesobutler avatar Jul 11 '24 19:07 jamesobutler

We should not need any redistributables, as the launcher is built statically and all other DLLs are in the paths set by the launcher.

Could you try to temporarily remove the MSVC DLLs in the system32 folder to see if those incorrectly installed DLLs are getting used instead the correct ones that Slicer provides? (because system32 folder has priority over additional folders in the path) If that's indeed the root cause of the issue then maybe we could make the loader ignore the stray DLLs in System32 by using manifests.

Installing the MSVC DLLs next to every .exe and .dll that uses them is not a good workaround, because we would need to copy the MSVC DLLs to several folders (each extension's CLI folders, maybe other binary folders).

lassoan avatar Jul 11 '24 20:07 lassoan

The Slicer launcher executable is built statically, but the CLI executables are not built statically because they have corresponding ResampleScalarVolumeLib.dll and are seemingly depending on the system MSVC runtimes?

Could you try to temporarily remove the MSVC DLLs in the system32 folder to see if those incorrectly installed DLLs are getting used instead the correct ones that Slicer provides?

Ok I manually removed the older 14.31.31103 version msvcp DLLs located in the system32 folder (which are represented by this install entry) image and ResampleScalarVolume then was successful. When I added them back ResampleScalarVolume went back to failing and producing [VTK] Resample Scalar Volume terminated with a fault in the Slicer python console. My ResampleScalarVolume.exe and associated Slicer application was built with latest v143 toolset with newer runtime 14.40.33810.0.

jamesobutler avatar Jul 11 '24 21:07 jamesobutler

The Slicer launcher executable is built statically, but the CLI executables are not built statically because they have corresponding ResampleScalarVolumeLib.dll and are seemingly depending on the system MSVC runtimes?

OK, this is good. Only the launcher can be built statically. Due to Qt licensing and binary sizes, everything else uses shared libraries.

Ok I manually removed the older 14.31.31103 version msvcp DLLs located in the system32 folder (which are represented by this install entry

This means that Microsoft messed this up. After many years of keeping MSVC runtime versions future compatible, they have now broken ABI compatibility.

This is now causing issues for many projects, see for example 1, 2, 3 and hundreds more on github.

@jamesobutler could you please check where we can set the _DISABLE_CONSTEXPR_MUTEX_CONSTRUCTOR preprocessor definition to apply it to all Slicer core and extensions?

Adding MSVC redistributable installation to the Slicer installer would be another solution, but then we would lose portability of the application (it would not run on all computers). This should work as an immediate solution and then I expect that Microsoft will fix this mistake in some way (e.g., they could disable constexpr mutex constructor by default in MSVC runtimes for a couple of more years, or roll out more recent runtimes as part of Windows update).

lassoan avatar Jul 15 '24 14:07 lassoan

The Redistributable your app uses has a similar binary-compatibility restriction. When you mix binaries built by different supported versions of the toolset, the Redistributable version must be at least as new as the latest toolset used by any app component.

Seems like they have had this type of warning around for awhile, but as you've linked there is finally a change part of VS 17.10 that has impacted our code. I will review how people have been handling it with _DISABLE_CONSTEXPR_MUTEX_CONSTRUCTOR and will follow-up with my testing results soon.

jamesobutler avatar Jul 15 '24 16:07 jamesobutler

Yes, the promise has been always for backward compatibility only (not for future compatibility) but in practice there has been good compatibility between different MSVC runtimes for many years.

lassoan avatar Jul 15 '24 17:07 lassoan

Unfortunately I don't think I have the knowledge to complete this issue after trying various builds and tests associated with _DISABLE_CONSTEXPR_MUTEX_CONSTRUCTOR. My commit didn't work as I still observed a fault running ResampleScalarVolume CLI using the Slicer GUI.

../Slicer-build/CMakeCache.txt does indeed have the following with my changes where it includes D_DISABLE_CONSTEXPR_MUTEX_CONSTRUCTOR :

//CMake CXX Flags
CMAKE_CXX_FLAGS:STRING='  /DWIN32 /D_WINDOWS /GR /EHsc /bigobj /MP16 /D_DISABLE_CONSTEXPR_MUTEX_CONSTRUCTOR  /bigobj  '

I even placed #define _DISABLE_CONSTEXPR_MUTEX_CONSTRUCTOR directly in ResampleScalarVolume.cxx, rebuilt and observed the same fault.

I have confirmed in ../Slicer-build/Modules/CLI/ResampleScalarVolume/ResampleScalarVolume.vcxproj image

At this point I don't think I am making any significant progress. Maybe there is something else going on in regards to the Slicer Execution Model stuff for CLIs which I don't understand.

jamesobutler avatar Jul 16 '24 22:07 jamesobutler

Related:

  • https://github.com/actions/runner-images/issues/10020
  • https://github.com/search?q=_DISABLE_CONSTEXPR_MUTEX_CONSTRUCTOR&type=commits

jcfr avatar Oct 01 '24 14:10 jcfr

For reference, compiler installed on the windows factory (overload) is the following:

-- Selecting Windows SDK version 10.0.19041.0 to target Windows 10.0.19045.
-- The C compiler identification is MSVC 19.39.33523.0
-- The CXX compiler identification is MSVC 19.39.33523.0

jcfr avatar Jan 20 '25 20:01 jcfr