runtime/race: ThreadSanitizer failed to allocate 0x0000005c9000 (6066176) bytes at 0x200dc940a0000 (error code: 87)
What version of Go are you using (go version)?
$ go version
go version go1.16.4 windows/amd64
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (go env)?
$ go env
set GO111MODULE=on
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\---\AppData\Local\go-build
set GOENV=C:\Users\---\AppData\Roaming\go\env
set GOEXE=.exe
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GOMODCACHE=c:\---\pkg\mod
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=c:\---
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=c:\---\Go\go1.16.4
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=c:\---\Go\go1.16.4\pkg\tool\windows_amd64
set GOVERSION=go1.16.4
set GCCGO=gccgo
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=C:\---\go.mod
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\Users\---\AppData\Local\Temp\go-build71564379=/tmp/go-build -gno-record-gcc-switches
What did you do?
Use the source in "Usage" section of https://github.com/go-gl/glfw, then run:
$ go build -race; ./test
What did you expect to see?
The program running correctly, as if -race was not provided.
What did you see instead?
ERROR: ThreadSanitizer failed to allocate 0x0000005c9000 (6066176) bytes at 0x200dc940a0000 (error code: 87)
The error appears on each run. The count, address and error code are always the same. Using different program, the count and address are different.
The race detector uses ~10x the memory that a run without -race uses. Are you actually running out of memory? How much memory does a regular run use?
It is possible that the race detector can't allocate memory because the address space it wants is being used by something else.
I'm not sure exactly what would cause that, but if your code mmaps or dlopens lots of stuff, that might be a contributing factor.
The task manager shows less than 10 MB in use for the simple example. Looking at the GLFW source code, mmap is used for some small bitmaps and there are some dlopen calls, just for OpenGL and related libraries, but it seems to me that they are not used on Windows (LoadLibraryA is used instead; I don't know if it is the same). Moreover, error code 87 is "the parameter is incorrect" under Windows.
Hm, I don't know then. You have officially overreached my Windows knowledge :(
cc @dvyukov @bufflig
I too am seeing this on my project. I need to confirm, but I think I wasn't seeing the issue with 1.16.3, but then hit it with 1.16.4
Also seeing this with fkie-cad/yapscan on windows only. That being said, I do do a bunch of c-stuff and reading remote process memory.
go version go1.16.4 windows/amd64
I too am seeing this on my project. I need to confirm, but I think I wasn't seeing the issue with 1.16.3, but then hit it with 1.16.4
Sorry, scratch that. I do see with 1.16.3 too. Go version bump coincided with other toolchain upgrades
cc @zx2c4
The problem is still present in 1.16.5.
Also affects 1.17beta1.
Still present in 1.17rc1
cc @dvyukov
I don't have a windows machine, so I can't debug it. Why can VirtualAlloc return 87? I thought maybe it's because size 0x5c9000 is not a multiple of allocation granularity (64k), but the msdn page says that in such cases the size is simply rounded up.
Is the MEM_LARGE_PAGES flag used? In that case the size needs to be a multiple of GetLargePageMinimum.
- Include the MEM_LARGE_PAGES value when calling the VirtualAlloc function. The size and alignment must be a multiple of the large-page minimum.
Source: https://docs.microsoft.com/en-us/windows/win32/memory/large-page-support
I don't know if it's relevant, but according to this message in a Haskell forum: "this means that address given to VirtualAlloc is either not reserved yet, or that size is too big (ie the block-to-be-committed isn't inside one VirtualAlloc MEM_RESERVE block)". Another source points to ASLR, but this should not be the case since the problem is always with the same address. This IBM bug report states: Windows API VirtualAlloc is requested to allocate memory of size 64KB with flag MEM_LARGE_PAGES. This is a non-standard allocation and the API fails with an error "The parameter is incorrect".
MEM_LARGE_PAGES does not seem to be used in compiler-rt/* dir: https://github.com/llvm/llvm-project/search?q=MEM_LARGE_PAGES
This function matches the error message. There is an exception for Windows AddressSanitizer not applied to Go but it seems not related.
It also seems like the allocation is done with a fixed address?
Even if that is not the core of the issue in this case (since it seems to always happen), wouldn't that potentially lead to pseudorandom failures if ASLR decides to load a DLL there?
It also seems like the allocation is done with a fixed address?
Yes, with a fixed address. Well, first, tsan needs to allocate memory at a fixed address. So it's not that it's done for no reason, nor that it's possible to just remove the address. Second, on Linux tsan will avoid regions where the kernel can load anything and it reserves the remaining regions, so that even user mmaps won't happen at these addresses. I don't remember what happens on Windows.
I see. Thanks for clarifying. :)
Thought I'd give some data points:
It runs fine if I cross compiled from linux with (GCC) 9.3-posix 20200320
It does not run if I used TDM GCC 9.2 or 10.3
It also runs fine with (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 8.1.0
Looks like it might be specific to TDM GCC
EDIT:
Running with gdb, it crashes at:
racecall(&__tsan_map_shadow, start, size, 0, 0)
https://github.com/golang/go/blob/go1.16.6/src/runtime/race.go#L393 https://github.com/golang/go/blob/go1.16.6/src/runtime/race_amd64.s#L413
This also happens with MSYS2 GCC.
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=c:/Ignazio/Lavoro/IgnPack/Go/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/10.3.0/lto-wrapper.exe
Target: x86_64-w64-mingw32
Configured with: ../gcc-10.3.0/configure --prefix=/mingw64 --with-local-prefix=/mingw64/local --build=x86_64-w64-mingw32 --host=x86_64-w64-mingw32 --target=x86_64-w64-mingw32 --with-native-system-header-dir=/mingw64/x86_64-w64-mingw32/include --libexecdir=/mingw64/lib --enable-bootstrap --enable-checking=release --with-arch=x86-64 --with-tune=generic --enable-languages=c,lto,c++,fortran,ada,objc,obj-c++,jit --enable-shared --enable-static --enable-libatomic --enable-threads=posix --enable-graphite --enable-fully-dynamic-string --enable-libstdcxx-filesystem-ts=yes --enable-libstdcxx-time=yes --disable-libstdcxx-pch --disable-libstdcxx-debug --enable-lto --enable-libgomp --disable-multilib --disable-rpath --disable-win32-registry --disable-nls --disable-werror --disable-symvers --with-libiconv --with-system-zlib --with-gmp=/mingw64 --with-mpfr=/mingw64 --with-mpc=/mingw64 --with-isl=/mingw64 --with-pkgversion='Rev5, Built by MSYS2 project' --with-bugurl=https://github.com/msys2/MINGW-packages/issues --with-gnu-as --with-gnu-ld --with-boot-ldflags='-pipe -Wl,--dynamicbase,--high-entropy-va,--nxcompat,--default-image-base-high -Wl,--disable-dynamicbase -static-libstdc++ -static-libgcc' 'LDFLAGS_FOR_TARGET=-pipe -Wl,--dynamicbase,--high-entropy-va,--nxcompat,--default-image-base-high' --enable-linker-plugin-flags='LDFLAGS=-static-libstdc++\ -static-libgcc\ -pipe\ -Wl,--dynamicbase,--high-entropy-va,--nxcompat,--default-image-base-high\ -Wl,--stack,12582912'
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.3.0 (Rev5, Built by MSYS2 project)
Having the same issue when trying to run make test on Azure Pipelines. Win 2019 VM, GCC 11.2, Go 1.16.3. Has anyone tried with earlier/newer versions of Go and does that work? And the address is different on each run in my case.
Having the same issue when trying to run make test on Azure Pipelines. Win 2019 VM, GCC 11.2, Go 1.16.3. Has anyone tried with earlier/newer versions of Go and does that work? And the address is different on each run in my case.
I didn't have this problem with 1.16.3 , but was running musl GCC 9.2.1 mingw64 toolchain . When I upgraded Go to 1.16.4 I upgraded the toolchain to GCC 10.X at the same time. That's when I started having problems. So I think the GCC version is the significant thing, not the Go version. Unfortunately I haven't been able to try with >1.16.3 and a 9.2.1 toolchain
I feel like I have a basic understanding of two parts of this issue, but not how they fit together. Please correct anything I've gotten wrong.
- The tsan runtime is implemented in compiler-rt, which is part of the LLVM project. That is built into a .syso file using a specific version of Go and LLVM, and that .syso file is then incorporated into binaries built by Go when the -race flag is included. In Go code that uses cgo, the tsan runtime is calling a Windows API function with invalid parameters, which causes the panics in the logfile.
- Per this issue, the failure appears to happen across a variety of Go versions (Go>1.16.3?), but with specific gcc versions (gcc<10.0.0?).
I don't understand how those two parts are linked. Here's my guess:
When building with -race, Go must modify or wrap cgo bytecode to play nicely with the race detector. That bytecode is produced by the installed gcc, manipulated by the Go compiler, and then must run in an environment managed by tsan. My guess is that Go makes some assumptions about the produced code that are no longer true in newer gcc versions.
The raceinit function @AlexRouSg pointed to calls __tsan_map_shadow after rounding its size parameter to a page. __tsan_map_shadow calls MapShadow. That function gets the actual page size (dwPageSize via GetSystemInfo) and further rounds the start and end points of the region. It then calls MmapFixedSuperNoReserve, which calls directly to MmapFixedNoReserve. MmapFixedSuperNoReserve has a "FIXME" comment about using large-page support, but it seems like an invitation to an optimization, not a potential bug. On the first call, MapShadow also calls MmapFixedSuperNoReserve for the "data" segment, with explicit 64k alignment. Since the values in the error messages aren't 64k aligned, I think that's not the problematic call.
==5736==ERROR: ThreadSanitizer failed to allocate 0x000000909000 (9474048) bytes at 0x200dd4b374000 (error code: 87)
both of those values are 4k-aligned (and it looks like 4k is the page size on windows).
According to the docs for VirtualAlloc, alignment shouldn't even be necessary: it rounds as necessary, except in the case of MEM_LARGE_PAGES, which isn't in use here.
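As a quick check of the alignment claim, here is a minimal sketch that tests the size and address from the error message quoted above against 4k and 64k boundaries. The constants and the isAligned helper are illustrative names, not part of tsan or the Go runtime, and the 4k page size and 64k granularity are the values assumed in this thread.

```go
package main

import "fmt"

const (
	pageSize         = 4 << 10  // 4 KiB: page size assumed in the thread
	allocGranularity = 64 << 10 // 64 KiB: allocation granularity assumed in the thread
)

// isAligned reports whether v is a multiple of align.
// align must be a power of two.
func isAligned(v, align uint64) bool {
	return v&(align-1) == 0
}

func main() {
	// Size and address copied from the error message quoted above.
	values := []struct {
		name string
		v    uint64
	}{
		{"size", 0x000000909000},
		{"addr", 0x200dd4b374000},
	}
	for _, x := range values {
		fmt.Printf("%s %#x: 4k-aligned=%t 64k-aligned=%t\n",
			x.name, x.v, isAligned(x.v, pageSize), isAligned(x.v, allocGranularity))
	}
}
```

Both values come out 4k-aligned but not 64k-aligned, which matches the observation above.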
This is all related to point 1 above. I don't have a good way to start looking at point 2.
So, a few possibilities here:
- MS docs are wrong and 64k alignment is required (the comments in the llvm files suggest this!)
- This memory is already mapped somehow, in a way that makes it invalid (but, I think remapping is OK..)
That first possibility sounds relevant.. maybe gcc used to produce 64k-aligned regions in its output, and no longer does?
On msys distribution 20210725 we found that downgrading gcc didn't fix the issue, but downgrading binutils from latest (2.36.1-3 as of our test) to 2.35.1-2 did fix the ThreadSanitizer issue.
On msys distribution 20210725 we found that downgrading gcc didn't fix the issue, but downgrading binutils from latest (2.36.1-3 as of our test) to 2.35.1-2 did fix the ThreadSanitizer issue.
My gcc is 10.3.0 (tdm64-1). I also met the same problem. According to the comment, I used binutils 2.33.1 and fixed the issue.
go version go1.17.2 windows/amd64 gcc version 8.1.0 (x86_64-posix-seh-rev0, Built by MinGW-W64 project) also OK
I have had the same problem.
==23768==ERROR: ThreadSanitizer failed to allocate 0x0000016a1000 (23728128) bytes at 0x200d9afc00000 (error code: 87)
exit status 66
This problem carried on with 1.17.x version, however, as @huifly said, downgrading binutils helps so far.
I have been using this custom package from MinGW Distro - nuwen.net, version 17.1 that has gcc 9.2.0 and binutils 2.33.1 bundled and it works with go1.17.6 just fine, however, it appears there are some nice scripts that could enable building custom combination of desired tools so one could have all the packages upgraded while maintaining lower version of binutils.
In addition to that, it is a selfextracting package that only requires setting the path properly, so testing and deploying is rather trivial.
After stripping ASLR it works :) On Windows the failure happens because tsan tried to allocate in reserved address space above the maximum user address on amd64.
@tylermasci16 could you please explain how can I "strip aslr"? The only thing I found is go build -race -aslr=false
do it with this commit: https://github.com/golang/go/commit/56dac60074698d23dc6acc047e61d2ad59c9610d but seems to work only for c-shared builds.
FWIW, according to https://docs.microsoft.com/en-us/windows/win32/debug/system-error-codes--0-499-, error code: 87 is ERROR_INVALID_PARAMETER: “The parameter is incorrect.”
(Of course, that doesn't tell us which parameter is incorrect or for what reason, and ThreadSanitizer didn't even have the courtesy to tell us which system call produced the error. 😵)
LE: it was the MinGW version we were using, please check out this follow-up comment.
We've started seeing this exact error in the containerd Windows CI workflow since Friday (10/Jun/22) after switching from Go 1.18.0 to 1.18.3 in this PR.
+ integration ==1400==ERROR: ThreadSanitizer failed to allocate 0x0000025f9000 (39817216) bytes at 0x200dbfc7c0000 (error code: 87) exit status 66
In my debugging attempts so far I have tried the following (all leading to the same failure):
- increasing the VM size on Azure in case memory really was an issue
- reverting all other containerd-side patches we've had since Friday in case those were causing this
- tried switching back to Go 1.18.0 for Windows (this may indicate a bug in 1.18.3 which may have been backported to 1.18.X)
- currently trying to reproduce on my own machine (all of the above-mentioned test runs were on Azure-hosted VMs)
Random notes:
- seems 100% consistent since Friday (hadn't had a single non-crashing run since)
- all the tests run on Azure VMs spawned from the official Microsoft images which we've been using since March
- we've been installing Go using Chocolatey this whole time as seen here
The error appears on each run. The count, address and error code are always the same. Using different program, the count and address are different.
- I can also confirm that rebuilding the binary leads to different byte allocation count/address FWIW
Go env from the Azure machines used for the tests is the following:
# (Identical for both 2019 and 2022 except for the `debug-prefix-map`):
set GO111MODULE=
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\azureuser\AppData\Local\go-build
set GOENV=C:\Users\azureuser\AppData\Roaming\go\env
set GOEXE=.exe
set GOEXPERIMENT=
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GOMODCACHE=C:\Users\azureuser\go\pkg\mod
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=C:\Users\azureuser\go
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=c:\Program Files\Go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=c:\Program Files\Go\pkg\tool\windows_amd64
set GOVCS=
set GOVERSION=go1.18.3
set GCCGO=gccgo
set GOAMD64=v1
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=NUL
set GOWORK=
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\Users\azureuser\AppData\Local\Temp\go-build2049959765=/tmp/go-build -gno-record-gcc-switches
Any additional info or debugging ideas would be much appreciated!
I think you've updated minGW too, that's what's causing the issue.
@ncantelmo that was it, thanks a lot for all the help!
For future reference for anyone else running into this issue with a Chocolatey-installed MinGW:
- mingw = "10.2.0" works
- mingw = "10.3.0" fails
- mingw = "11.X.0" all fail (including 11.3.0)
Ran into this suddenly this week; 10.3.0 isn't working either. Is there a canonical advised development stack?
Happens with dlv too:
==34372==ERROR: ThreadSanitizer failed to allocate 0x000003061000 (50728960) bytes at 0x200dabe7d0000 (error code: 87)
Very unhelpful error, where is it from? The Go compiler itself?
I was using TDM but it seems dead/unmaintained so I switched to MinGW and that fails too.
Edit:
So this worked for me:
choco install mingw --version 10.2.0 --allow-downgrade
The ERROR_INVALID_PARAMETER error from VirtualAlloc is because the base address is too high.
The allocation needs to fit within the minimum and maximum application addresses as provided by GetSystemInfo.
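To illustrate the range check described here, the following is a rough sketch that compares the base address and size from the original report against the application address range. The bounds are assumptions: they are the values GetSystemInfo commonly reports on 64-bit Windows, hard-coded so the sketch runs anywhere. The fitsUserSpace helper is a hypothetical name, not a tsan or Windows API.

```go
package main

import "fmt"

// Assumed bounds: values commonly reported by GetSystemInfo on 64-bit
// Windows for lpMinimumApplicationAddress and lpMaximumApplicationAddress.
// A real check should query GetSystemInfo on the target machine.
const (
	minAppAddr uint64 = 0x0000000000010000
	maxAppAddr uint64 = 0x00007FFFFFFEFFFF
)

// fitsUserSpace reports whether [base, base+size) lies entirely inside
// the assumed application address range.
func fitsUserSpace(base, size uint64) bool {
	if size == 0 || base < minAppAddr {
		return false
	}
	end := base + size - 1
	// end >= base guards against wrap-around of the 64-bit sum.
	return end >= base && end <= maxAppAddr
}

func main() {
	// Base address and size from the error in the original report.
	fmt.Println(fitsUserSpace(0x200dc940a0000, 0x5c9000)) // false
	// An arbitrary lower address, for comparison only.
	fmt.Println(fitsUserSpace(0x00c000000000, 0x5c9000)) // true
}
```

Under these assumed bounds, the reported base address lies above the maximum application address, which would be consistent with VirtualAlloc rejecting the request.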
If https://github.com/llvm/llvm-project/blob/8246b2e156568c31e71e16cbaf4c14d316e7c06e/compiler-rt/lib/tsan/rtl-old/tsan_rtl.cpp#L319 is still correct, then this will be a problem, as in my local failure case it is not 64k aligned:
request: 0x20015511007070000 aligned: 0x20015511007000000
I think that this will be fixed when https://github.com/golang/go/issues/35006 is closed and the tsan binaries are recompiled (and updated to v3 like for Linux).
If you are building Go from source, please try the new race detector runtime (pending submit) to see if it resolves this issue. From your Go repo on windows:
git fetch https://go.googlesource.com/go refs/changes/97/420197/2 && git checkout FETCH_HEAD
This version of the runtime requires a more up-to-date C compiler version (in particular it requires libsynchronization.a).
With no other apparent change in the toolchain, Go 1.19 seems to solve for me.
With no other apparent change in the toolchain, Go 1.19 seems to solve for me.
maybe it was this?

even though the changes don't affect windows, perhaps some minor dependent change inadvertently fixed the bug...?
Also can confirm 1.19 gets things in order again.. Believe it was this set of patches from @thanm https://github.com/golang/go/commit/0c7fcf6bd1fd8df2bfae3a482f1261886f6313c1 https://github.com/golang/go/commit/eaf21256545ae04a35fa070763faa6eb2098591d
Updated to 1.19 and new error...
Build Error: go test -c -o d:\Work\odin\odin\api\src\services\deal\__debug_bin.exe -gcflags all=-N -l -v -race .
# runtime/cgo
In file included from c:\program files\x86_64-w64-mingw32-native\lib\gcc\x86_64-w64-mingw32\11.2.1\include-fixed\limits.h:34,
from c:\program files\x86_64-w64-mingw32-native\include\stdlib.h:11,
from _cgo_export.c:3:
c:\program files\x86_64-w64-mingw32-native\include\syslimits.h:12:25: error: no include path in which to search for limits.h
12 | #include_next <limits.h>
| ^ (exit status 2)
I just want to debug 1 test 😭
Edit: if I run the command manually, apparently it's not even valid:
❯ go test -c -o d:\Work\odin\odin\api\src\services\deal\__debug_bin.exe -gcflags all=-N -l -v -race .
go: unknown flag -l cannot be used with -c
c:\program files\x86_64-w64-mingw32-native\include\syslimits.h:12:25: error: no include path in which to search for limits.h 12 | #include_next <limits.h> | ^ (exit status 2)
I think this is getting a bit farther afield from the original issue ("ThreadSanitizer failed to allocate"). Seems like maybe something is out of whack with your gcc installation.
❯ go test -c -o d:\Work\odin\odin\api\src\services\deal\__debug_bin.exe -gcflags all=-N -l -v -race .
go: unknown flag -l cannot be used with -c
That looks as though the "-l" is being interpreted by the Go command and not passed to the compiler. I would try fixing up your quoting, e.g. "-gcflags all=-N -l" etc.
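The quoting point can be sketched as an argument list. This is only an illustration of how the arguments may have been split, assuming the command was tokenized on whitespace; brokenArgs and fixedArgs are hypothetical helpers, and the -o path is omitted for brevity.

```go
package main

import (
	"fmt"
	"strings"
)

// brokenArgs splits the command line on whitespace, so "-l" becomes a
// separate argument that the go command tries to interpret itself.
func brokenArgs() []string {
	return strings.Fields("test -c -gcflags all=-N -l -v -race .")
}

// fixedArgs keeps "all=-N -l" as a single argument, so both -N and -l
// are passed through to the compiler via -gcflags.
func fixedArgs() []string {
	return []string{"test", "-c", "-gcflags", "all=-N -l", "-v", "-race", "."}
}

func main() {
	fmt.Printf("%q\n", brokenArgs())
	fmt.Printf("%q\n", fixedArgs())
}
```

In a shell, the second form corresponds to quoting the flag value, as suggested above.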
Seems like maybe something is out of whack with your gcc installation.
I forgot this even used gcc haha I'm not sure which version I have installed, I always assumed Go on Windows used the native Microsoft compiler for C code.
I would try fixing up your quoting
none of these are commands I've written, these are just what comes out when I click "debug test" in vscode
I'll experiment a bit more next week, this was working fine for literal years and suddenly one day it all breaks and I can't debug any more, so annoying.