"Cannot allocate executable pages" on macOS/arm64 if GC_set_pages_executable(1)
While working on a native macOS arm64 build of my project, the binaries fail with the following error: "Cannot allocate executable pages".
So I built gctest (as per the instructions, using the latest sources from the master branch), and it fails with the same error.
% ./gctest
Cannot allocate executable pages
zsh: abort ./gctest
% file gctest
gctest: Mach-O 64-bit executable arm64
When built for x64 using the same version of bdwgc on the same machine, the binaries run as expected (under Rosetta, presumably). In my project, bdwgc is a static build.
% gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/4.2.1
Apple clang version 13.0.0 (clang-1300.0.29.3)
Target: arm64-apple-darwin21.1.0
Thread model: posix
Any ideas what the issue may be? Perhaps there's some config I can tweak? Thanks!
Do you use bdwgc master?
Is USE_MMAP_ANON defined or not? Try the opposite, because GC_unix_mmap_get_mem logic is different.
The difference in gcconfig.h between arm64 and x64: MPROTECT_VDB is not defined for arm64. I don't think it will help, but try passing -D MPROTECT_VDB to CFLAGS_EXTRA for arm64 build.
I originally tried the 8.2.0 release, and also the latest master (today). Same result.
Well, turning off USE_MMAP_ANON now gives a different error when running gctest:
GC Warning: Out of memory - trying to allocate requested amount (8224 bytes)...
Insufficient memory for GC_all_nils
Do you think it may be related to mmap issues other projects have encountered, e.g. https://github.com/nodejs/node/issues/37061#issuecomment-774175983?
Probably. What solution is possible? Apple's docs mention MAP_JIT and the com.apple.security.cs.allow-unsigned-executable-memory entitlement key.
It looks like we need a fix similar to that in Chromium. Will it help? You can quickly try it by adding MAP_JIT
Applying MAP_JIT, the entitlements, and code-signing the app results in a bus error on the first write to memory.
Here is a reproducible example:
#include <stdio.h>
#include <sys/mman.h>

int main(void) {
    int N = 5;
    /* Ask for pages that are readable, writable and executable at once. */
    int *ptr = mmap(NULL, N * sizeof(int), PROT_READ | PROT_WRITE | PROT_EXEC,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); /* fd is -1 for anonymous mappings */
    if (ptr == MAP_FAILED) {
        printf("Mapping Failed\n");
        return 1;
    }
    /* First write to the mapped memory. */
    for (int i = 0; i < N; i++) {
        ptr[i] = i * 10;
    }
    for (int i = 0; i < N; i++)
        printf("[%d] ", ptr[i]);
    printf("\n");
    /* Unmap the same length that was mapped. */
    int err = munmap(ptr, N * sizeof(int));
    if (err != 0) {
        printf("UnMapping Failed\n");
        return 1;
    }
    return 0;
}
Built for x64, it runs without a problem: gcc -arch x86_64 -o test test.c
However, the native build (arm64) does not work: gcc -o mem_test mem_test.c
If we add the MAP_JIT flag, we are also meant to specify an entitlement, which is applied to the file when code-signing.
entitlements.plist
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>com.apple.security.cs.allow-jit</key>
<true/>
<key>com.apple.security.cs.allow-unsigned-executable-memory</key>
<true/>
</dict>
</plist>
Apply code signing, with runtime hardening:
codesign --deep -f -s $APP_ID --entitlements ./entitlements.plist -o runtime test
The result is a bus error, and gctest fails with the same error. Here's a snippet of the crash info from gctest:
Termination Reason: Namespace SIGNAL, Code 10 Bus error: 10
Terminating Process: exc handler [53409]
VM Region Info: 0x102b08000 is in 0x102b08000-0x102b18000; bytes after start: 0 bytes before end: 65535
REGION TYPE START - END [ VSIZE] PRT/MAX SHRMOD REGION DETAIL
MALLOC metadata 102b04000-102b08000 [ 16K] rw-/rwx SM=ZER
---> VM_ALLOCATE 102b08000-102b18000 [ 64K] rwx/rwx SM=PRV
GAP OF 0x44ae8000 BYTES
MALLOC_TINY 147600000-147700000 [ 1024K] rw-/rwx SM=PRV
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 libsystem_platform.dylib 0x1a4613eac _platform_memset + 108
1 gctest 0x10285e7f8 GC_init_headers + 124
2 gctest 0x102854944 GC_init + 940
3 gctest 0x102853ef8 main + 56
4 dyld 0x102a290f4 start + 520
There's an interesting comment here: https://github.com/zherczeg/sljit/issues/99, which refers to an Apple document: https://developer.apple.com/documentation/apple-silicon/porting-just-in-time-compilers-to-apple-silicon?preferredLanguage=occ
This implies that on macOS Apple Silicon (arm64) you cannot set memory to PROT_WRITE and PROT_EXEC at the same time.
Indeed. If you modify the example above to exclude PROT_EXEC, the example runs without a problem on arm64 - with or without code-signing / runtime hardening.
If I set NO_EXECUTE_PERMISSION in bdwgc for gctest, it appears to run okay.
What are the consequences of not having PROT_EXEC memory for bdwgc?
NO_EXECUTE_PERMISSION is defined by default (in all provided build scripts). This is done for performance reasons. Clients can override this by calling GC_set_pages_executable(1) before GC_INIT() if they need to allocate executable memory.
According to this issue, clients on macOS/arm64 cannot allocate executable memory with the current bdwgc version. I'm not aware of any known bdwgc clients that are affected.
Implementing a workaround in bdwgc is not trivial (and involves proposing a new API for clients). The usage could look like this, just a guess:
p = GC_generic_or_special_malloc(size, PTRFREE | GC_OBJ_EXEC); // p will occupy 1 or more pages with MAP_JIT
... // write code to p
GC_end_stubborn_change(p); // informs GC to change pages of p from PROT_WRITE to PROT_EXEC (e.g. call pthread_jit_write_protect_np(true))
... // execute code in p
// GC_change_stubborn(p); // inform GC that the client wants to modify p (e.g. call pthread_jit_write_protect_np(false))
... // update code in p
GC_end_stubborn_change(p);
If someone needs this functionality, patches are welcome.
The clasp project (https://github.com/clasp-developers/clasp.git) is a Boehm client that needs this functionality. I'm looking into what would be required to implement it.
I found a way around it. I don't use the GC to allocate large blocks to put JITted code into them.
Instead I use mmap with MAP_JIT and the bdwgc GC_add_roots(...) function to tell bdwgc what parts of that memory contain roots.
This follows the instructions for porting JIT compilers to Apple Silicon:
https://developer.apple.com/documentation/apple-silicon/porting-just-in-time-compilers-to-apple-silicon?preferredLanguage=occ
Got it. Thank you for the workaround idea.
Is there much of a cost to using GC_add_roots(...) - once we build our entire system there will be about 12,000 blocks of roots. I figure it can't be worse than what I was doing before, which was having bdwgc manage 12,000 blocks of code and data, most of which did not contain roots. I was very impressed by how well bdwgc handled that.
No worries - I'll find out.
I found out - there is a hard limit of 8192 root sets (when built with LARGE_CONFIG defined).
I will investigate turning that into a dynamic array. I'll need more root sets than that if I want this workaround to work.
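To illustrate the general idea of replacing a fixed-size table with a growable one, here is a standalone sketch. It is not bdwgc code: root_range, root_table and root_table_add are hypothetical names, and it uses realloc, whereas a real patch would presumably have to use the collector's own internal allocation and locking.

```c
#include <stddef.h>
#include <stdlib.h>

/* One registered range: [start, end). */
struct root_range {
    char *start;
    char *end;
};

/* Growable table of root ranges; zero-initialize before first use. */
struct root_table {
    struct root_range *v;
    size_t n;      /* entries in use */
    size_t cap;    /* entries allocated */
};

/* Append a range, doubling the capacity when the table is full.
 * Returns 0 on success and -1 if memory could not be obtained. */
int root_table_add(struct root_table *t, void *start, void *end)
{
    if (t->n == t->cap) {
        size_t new_cap = t->cap ? t->cap * 2 : 64;
        struct root_range *nv = realloc(t->v, new_cap * sizeof *nv);
        if (nv == NULL)
            return -1;
        t->v = nv;
        t->cap = new_cap;
    }
    t->v[t->n].start = start;
    t->v[t->n].end = end;
    t->n++;
    return 0;
}
```

Doubling keeps the amortized cost of an append constant, so going from 8192 entries to 12,000 or more costs only one extra reallocation.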
Okay. It is better to open another issue for this.
I reproduced it myself.
Host: Darwin gcc104.fsffrance.org 21.6.0 Darwin Kernel Version 21.6.0: Mon Aug 22 20:20:05 PDT 2022; root:xnu-8020.140.49~2/RELEASE_ARM64_T8101 arm64
Source: master (c94898ba7)
gcc -O0 -g -Wno-deprecated-declarations -I include tests/gctest.c extra/gc.c && ./a.out
Cannot allocate executable pages
Abort trap: 6
If we allocate without PROT_EXEC, then it works:
gcc -O0 -g -Wno-deprecated-declarations -D NO_EXECUTE_PERMISSION -I include tests/gctest.c extra/gc.c && ./a.out
Completed 6 tests
...