GIMP crash caused by setlocale bug on Windows (via Exiv2 in the call stack)

Open Wormnest opened this issue 8 months ago • 30 comments

Describe the bug

We (GIMP) have received several crash issues on Windows that proved difficult to diagnose. Our Ms Store handler finally received a crash log that indicates a possible relation to exiv2.

In other cases the telltale sign is libc++abi: terminating due to uncaught exception of type std::bad_alloc: std::bad_alloc, which at this time we are assuming to be the same issue. (Most of GIMP itself is C, not C++)

To Reproduce

Steps to reproduce the behavior:

  1. GIMP's Windows developers have not been able to reproduce on their systems. We depend on users reporting this.
  2. They try to load an image and then they get a crash, usually without details.

Expected behavior

Loading of images including loading their metadata through gexiv2 -> exiv2 succeeds.

Desktop (please complete the following information):

  • OS and version: so far this seems to be Windows only
  • Exiv2 version and source: GIMP 3.0.0 and 3.0.2 use 0.28.5, RC3 where this also happened may have used an earlier version, all supplied by MSYS2, CLANG64 profile.
  • Compiler and version: clang as provided by MSYS2
  • Compilation mode and/or compiler flags:

Additional context

Looking at the trace linked above, I wonder if this could be related to the code that replaced the date handling regex, maybe in combination with locale handling? Just a wild guess, I'm not a C++ expert.

I wish we could give more details, but I thought it good to make you aware of this.

Wormnest avatar Apr 01 '25 19:04 Wormnest

If someone needs it, I can send the memory dump, minidump and other files we got from Windows tools about this bug.

brunvonlope avatar Apr 01 '25 19:04 brunvonlope

Are you able to reproduce this in GIMP with this image? https://gitlab.gnome.org/GNOME/gimp/-/issues/12626#note_2368064

I've run exiv2 on that file from a Linux command line and it looks very boring: I don't see any weird values that would be likely to trigger a bug. So I'm wondering if the bug only happens with the larger 347 MB file that's mentioned in the thread.

kevinbackhouse avatar Apr 02 '25 11:04 kevinbackhouse

By the way, I don't think this is necessarily date-related. Is that hypothesis based on the mention of ZNK5Exiv29Exifdatum8toStringEv in the stack trace?

kevinbackhouse avatar Apr 02 '25 12:04 kevinbackhouse

Are you able to reproduce this in GIMP with this image?

No, but GitLab removes most metadata due to privacy concerns.

By the way, I don't think this is necessarily date-related. Is that hypothesis based on the mention of ZNK5Exiv29Exifdatum8toStringEv in the stack trace?

Yes, based on that together with knowing that the regex was replaced fairly recently, but as I mentioned above, just a wild guess since I've only glanced at exiv2 code a few times.

Wormnest avatar Apr 02 '25 14:04 Wormnest

Are you able to reproduce this in GIMP with this image?

No, but GitLab removes most metadata due to privacy concerns.

I'm happy to take a look, but I do need access to the image that causes the problem.

kevinbackhouse avatar Apr 02 '25 14:04 kevinbackhouse

No surprise here. Regex replacement is not a simple task. The code works only well enough to pass the test suite.

neheb avatar Apr 10 '25 01:04 neheb

It happens for any png I try to load. I just made a white 16x16 px png in photopea and zipped it for upload here:

16x16.zip

no special metadata afaik.

Just clicking the png file in the file selector (so even before opening, but I guess it tries to make a preview) crashes gimp 3.0.2-1.

$ gimp-3.exe --verbose --console-messages
> .....
> libc++abi: terminating due to uncaught exception of type std::bad_alloc: std::bad_alloc

Matsemann avatar Apr 23 '25 12:04 Matsemann

If this is a date issue, it might be related to regional settings and the various date formats. I was getting constant crashes with Gimp 3.0.2-1 (on Windows 11) today, both when trying to save/export anything, or loading an image with the open dialog. Same message as described above:

libc++abi: terminating due to uncaught exception of type std::bad_alloc: std::bad_alloc

I'm using Norwegian regional settings, so today's date would be written as 13.05.2025. I changed to the US regional format (5/13/2025) and all the open/save/export crashes went away.

If it's trying to parse the 13th as month 13, I'm guessing that could potentially cause issues.

jone-l avatar May 13 '25 17:05 jone-l

So effectively the same issue as https://github.com/Exiv2/exiv2/pull/3264

Has this been backtraced with gdb?

neheb avatar May 13 '25 19:05 neheb

Huh, I can confirm. Changing from Norwegian Bokmål (Norway) regional format on my Windows computer to English (United States) allowed me to open files that used to crash. Changing back and restarting gimp they crash again.

Matsemann avatar May 13 '25 19:05 Matsemann

Windows 11 has a UTF8 option. Does that also crash?

neheb avatar May 13 '25 19:05 neheb

The Norwegian locale possibly points in the direction of this. In GIMP we set UTF-8 in the Windows manifest. I've wanted to make a test build with that reverted, but we have some unrelated issues building with gcc 15.1 that need to be fixed first.

Wormnest avatar May 14 '25 02:05 Wormnest

For the record, Norwegian Nynorsk (Norway) seems to work fine (pretty much identical formats to the Bokmål one, but no non-ASCII characters in the name), so I'm using that as a workaround for now.

There aren't many locales with potentially problematic characters in the name, but I found that these two also seem to cause crashes reliably:

  • Dutch (Curaçao)
  • French (Côte d'Ivoire)

jone-l avatar May 14 '25 08:05 jone-l

I've wanted to make a test build with that reverted, but we have some unrelated issues building with gcc 15.1 that need to be fixed first.

@Wormnest You can test the msix by removing the ActiveCodePage there too https://gitlab.gnome.org/GNOME/gimp/-/blob/master/build/windows/store/AppxManifest.xml#L75

brunvonlope avatar May 14 '25 08:05 brunvonlope

I forgot to follow up here. The one user who tested our test build where we did not set UTF-8 did not see a difference.

We now have confirmation that changing the regional format settings in Windows from Norwegian Bokmål to English fixes the crashes. I have also seen duplicate reports that appear to come from Turkish users. Taken together, this is a strong indication of the link I mentioned earlier.

Since GIMP not setting locale to UTF-8 in our Windows manifest didn't change anything, I am not sure there is anything more we can do on our end.

Most likely this is happening in either gexiv2 or exiv2, given the relation to importing/exporting images and the fact that those are C++ code (GIMP is mostly C).

I've thought of trying one of the affected languages myself, but my OS hard drive is low on space, so I won't be doing that anytime soon.

Wormnest avatar May 26 '25 14:05 Wormnest

A better stack trace can now be found here. Not sure how easy it is to work around something that seems like a Windows bug.

Wormnest avatar Aug 20 '25 15:08 Wormnest

Please could somebody try this fix and let me know if it works? https://github.com/Exiv2/exiv2/pull/3359 It's basically a guess, based on the hypothesis that setlocale is to blame.

I've spent the whole day trying to set up a Windows VM in Azure so that I can build and debug GIMP, but I'm down a deep rabbit hole trying to do stuff like figuring out how to install winget, so I'm not even close to having a working dev environment yet.

kevinbackhouse avatar Aug 23 '25 19:08 kevinbackhouse

I'm down a deep rabbit hole trying to do stuff figuring out how to install winget

Winget is not required to build exiv2 or GIMP. MSYS2 provides a GUI installer on their page; you can find it on Google.

Also, is that a Windows 10 VM or a Windows Server VM? Standard Windows 10 (not Server) has had winget since build 1809.

brunvonlope avatar Aug 23 '25 20:08 brunvonlope

Also, is that a Windows 10 VM or a Windows Server VM? Standard Windows 10 (not Server) has had winget since build 1809.

Windows Server 2022

kevinbackhouse avatar Aug 23 '25 22:08 kevinbackhouse

Windows Server 2022

That explains why. winget is available by default only on Windows Server 2025.

But I don't even know whether the traditional installer would work. None of our runners (the one provided by the MSYS2 guys and the one provided by ARM64) actually installed MSYS2 the traditional way (without winget). Unfortunately, I don't know how to help.

brunvonlope avatar Aug 23 '25 23:08 brunvonlope

crash location

I've managed to attach a debugger to this to see what's happening. The crash happens during this line of code:

https://github.com/Exiv2/exiv2/blob/d6c67cdd39970795d078bd8ae457af27bf035296/include/exiv2/value.hpp#L1557

It happens while it's printing *i, which is a uint16_t with value 15. The call to std::setprecision is irrelevant: I tried removing it, but that made no difference. I also tried casting *i to a larger type, like uint32_t or size_t, but that also made no difference.

Image

setlocale hypothesis

I believe that I have confirmed that the setlocale hypothesis is correct. I set a breakpoint on setlocale to see where it's called. It's called numerous times during gimp's startup (before the splash screen appears). After that it isn't called again until I try to open a file. It is never called directly by exiv2, but gets called indirectly by snprintf during the printing of *i.

Image

The crash happens immediately afterwards, during the same call to snprintf.

workaround

Given that exiv2 never calls setlocale directly, I don't think this is our bug. As extra confirmation of that, I was able to stop the crash from happening by doing this:

https://github.com/Exiv2/exiv2/pull/3361

Please can others try that to see if it solves the issue? I don't want to merge that PR though, because it's not a good fix. To me, this seems like a bug in Windows.

kevinbackhouse avatar Aug 24 '25 15:08 kevinbackhouse

I'm able to reproduce it in an MSYS2 CLANG64 shell with this program:

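#include <clocale>   // setlocale
#include <cstdint>   // uint16_t
#include <iostream>

int main() {
	// Switch the C locale to the system default with UTF-8 encoding.
	setlocale(LC_ALL, ".UTF-8");
	uint16_t x = 15;
	std::cout << x << "\n";
	return 0;
}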

Compile and run with clang:

clang++ -g test.cpp
./a.exe

It crashes with the same error:

# ./a.exe
libc++abi: terminating due to uncaught exception of type std::bad_alloc: std::bad_alloc

Note: that program works fine when compiled with Visual Studio. So it's something to do with the MSYS2 environment.

kevinbackhouse avatar Aug 24 '25 15:08 kevinbackhouse

Note: that program works fine when compiled with Visual Studio. So it's something to do with the MSYS2 environment.

The test code here though is supposedly happening in Visual Studio (I don't have that installed to test).

Wormnest avatar Aug 24 '25 17:08 Wormnest

Compiling the code above on MSVC 2022 I get:

PS cl /Zi /DEBUG test.cpp /Fe:a.exe

C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.44.35207\include\__msvc_ostream.hpp(781): warning C4530: C++ exception handler used, but unwind semantics are not enabled. Specify /EHsc
C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.44.35207\include\__msvc_ostream.hpp(781): note: the template instantiation context (the oldest one first) is
test.cpp(6): note: see reference to function template instantiation 'std::basic_ostream<char,std::char_traits<char>> &std::operator <<<std::char_traits<char>>(std::basic_ostream<char,std::char_traits<char>> &,const char *)' being compiled

It does not crash (as already said above):

PS .\a.exe
15

brunvonlope avatar Aug 24 '25 17:08 brunvonlope

The code linked by Jacob compiles fine (with no warnings) on MSVC and does not crash.

brunvonlope avatar Aug 24 '25 17:08 brunvonlope

Note: that program works fine when compiled with Visual Studio. So it's something to do with the MSYS2 environment.

The test code here though is supposedly happening in Visual Studio (I don't have that installed to test).

I cannot reproduce the crash with that test case. I've tried it with both Visual Studio and MSYS2 CLANG64 and both are fine.

kevinbackhouse avatar Aug 24 '25 18:08 kevinbackhouse

Ignore my previous comment. The bug is also reproducible with Visual Studio. The code from the old bug report has a typo (a space character in "en_US. UTF-8"). With the typo fixed, it's reproducible in Visual Studio:

#include <locale.h>

int main()
{
	setlocale(LC_ALL, "en_US.UTF-8");
	setlocale(LC_ALL, "Norwegian Bokm\xE5l_Norway.1252");
}

kevinbackhouse avatar Aug 25 '25 12:08 kevinbackhouse

Hi! Just wondering, but... isn't this an issue in clang's libc++?

EDIT: I'm building a debug version of libc++ to see what's causing the terminate() call

lb90 avatar Sep 22 '25 12:09 lb90

Hi @kevinbackhouse! The following code has an issue:

#include <locale.h>

int main()
{
	setlocale(LC_ALL, "en_US.UTF-8");
	setlocale(LC_ALL, "Norwegian Bokm\xE5l_Norway.1252");
}

It starts by setting the narrow string charset to UTF-8. After that, the CRT expects narrow strings to be UTF-8 encoded, however "Norwegian Bokm\xE5l_Norway.1252" is not valid UTF-8.

To avoid such problems, one should use _wsetlocale.

I have now opened a PR to address this issue in libc++.

lb90 avatar Sep 24 '25 10:09 lb90

Huh, I can confirm. Changing from Norwegian Bokmål (Norway) regional format on my Windows computer to English (United States) allowed me to open files that used to crash. Changing back and restarting gimp they crash again.

Yes, this also happens with Turkish. When I change to English (United States), it works normally.

bdybSoftware avatar Oct 03 '25 16:10 bdybSoftware