Changes to the code cache repository allocation code
In order to eliminate helper trampolines, OpenJ9 tries to allocate the code cache in the vicinity of the JIT dll. This intention is signalled by using a preferredStartAddress when calling allocateCodeCacheSegment(). Currently, only x86-64 uses this approach.
This commit implements the following changes:
- For Linux, we increase the search space to almost 2 GB. Also, we prefer to start with an approach that uses the smaps pseudofile to find memory ranges where the code cache could be allocated.
- For Windows, we keep a smaller memory search space because smaps is not available to speed up the search process.
- We compute a preferred alignment and pass that to the VM. For x86-64 this alignment is 2 MB, i.e. the size of large pages used by the Transparent Huge Page (THP) mechanism. The alignment is relevant only when preferredStartAddress is provided.
- For Linux we provide a hint to the OS (with madvise) that we prefer the usage of THP for the span of the code cache repository.
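
To illustrate the idea behind the smaps-based search and the preferred alignment, here is a minimal sketch, not the code in this commit. It assumes the mapped ranges have already been parsed from a smaps-style listing into a sorted array. The names `mapped_range`, `align_up` and `find_aligned_gap` are hypothetical and do not exist in OpenJ9.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One mapped range [start, end), as could be parsed from a smaps-style listing. */
typedef struct {
    uint64_t start;
    uint64_t end;
} mapped_range;

/* Round addr up to a multiple of alignment (alignment must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr + alignment - 1) & ~(alignment - 1);
}

/*
 * Hypothetical helper: walk mappings sorted by start address and return the
 * lowest aligned address in [lo, hi) where `size` bytes fit into a gap.
 * Returns 0 when no such gap exists.
 */
static uint64_t find_aligned_gap(const mapped_range *maps, size_t count,
                                 uint64_t lo, uint64_t hi,
                                 uint64_t size, uint64_t alignment)
{
    uint64_t cursor = lo; /* lowest address not yet known to be occupied */
    for (size_t i = 0; i <= count; i++) {
        /* The current gap ends at the next mapping, or at hi after the last one. */
        uint64_t gap_end = (i < count && maps[i].start < hi) ? maps[i].start : hi;
        uint64_t candidate = align_up(cursor, alignment);
        /* candidate >= cursor guards against wrap-around in align_up. */
        if (candidate >= cursor && candidate < gap_end && gap_end - candidate >= size) {
            return candidate;
        }
        if (i < count && maps[i].end > cursor) {
            cursor = maps[i].end; /* skip past this mapping */
        }
        if (cursor >= hi) {
            break;
        }
    }
    return 0;
}
```

A caller would pass the search window around the JIT dll as `lo` and `hi`, and 2 MB as `alignment` on x86-64. Knowing the gaps up front is what makes a window of almost 2 GB affordable, compared with probing addresses one at a time.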
jenkins test sanity all jdk21
jenkins test xlinux all jdk21
I have changed the code so that chooseCacheStartAddress() picks the start address and allocateCodeCacheSegment() follows that recommendation. It's still a tight coupling in the sense that chooseCacheStartAddress() "knows" that allocateCodeCacheSegment() will use smaps to search for a gap in the address space, and therefore we can afford to search over a larger area.
Added ASSERT_FATAL as suggested. On my machine at home I managed to configure 1GB pages. For some of the runs I get an output like:
#CODECACHE: The code cache repository was allocated between addresses 00007F72F90CF000 and 00007F73390CF000 to avoid helper trampolines. alignment=1073741824 largeCodePageSize=1073741824
#CODECACHE: allocated code cache segment of size 1073741824
#CODECACHE: allocateCodeCacheRepository: size=1073741824 heapBase=00007F72F90CF000 heapAlloc=00007F72F90CF008 heapTop=00007F73390CF000
#CODECACHE: carved size=2097144 range: 00007F72F90CF008-00007F72F92CF000
#CODECACHE: CodeCache allocated 00007F73480CD8E0 @ 00007F72F90CF008-00007F72F92CF000 HelperBase:00007F72F92CE270
The code cache repository is not 1GB aligned and I would like to understand why.
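
For reference, the misalignment in the log above can be confirmed with a small power-of-two check. This is only a sketch for the arithmetic; `is_aligned` and `misalignment` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if addr is a multiple of alignment (a power of two), else 0. */
static int is_aligned(uint64_t addr, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr & (alignment - 1)) == 0;
}

/* Distance of addr above the previous alignment boundary. */
static uint64_t misalignment(uint64_t addr, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return addr & (alignment - 1);
}
```

For the start address 00007F72F90CF000 and alignment=1073741824 (0x40000000), `misalignment` gives 0x390CF000, so the repository is not 1 GB aligned. It is not 2 MB aligned either (the remainder is 0xCF000), only 4 KB aligned, which is consistent with an allocation that used default pages. The span between the two logged addresses is exactly 0x40000000 bytes.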
I have replaced the ASSERT_FATAL with an if statement because endAddress can be smaller than startAddress when the code cache repository is very large and cannot fit in the vicinity of the JIT dll. If that happens, we simply let the OS pick any address it wants.
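
A rough sketch of why the range can become empty, under the assumption that the whole repository must lie within some `reach` of an anchor address near the JIT dll. This is a simplified model, not the actual computation in chooseCacheStartAddress(); `search_window` and `compute_search_window` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t start; /* lowest acceptable repository start address */
    uint64_t end;   /* highest acceptable repository start address */
    int valid;      /* 0 means: pass no preferred address, let the OS choose */
} search_window;

/*
 * Assumed model: the repository [s, s + repo_size) must lie entirely inside
 * [anchor - reach, anchor + reach]. If repo_size exceeds that span, the
 * computed end would fall below start, so the window is reported as invalid
 * instead of asserting.
 */
static search_window compute_search_window(uint64_t anchor, uint64_t reach,
                                           uint64_t repo_size)
{
    search_window w = { 0, 0, 0 };
    assert(repo_size > 0);
    uint64_t lowest = (anchor > reach) ? anchor - reach : 0;
    uint64_t top = (anchor <= UINT64_MAX - reach) ? anchor + reach : UINT64_MAX;
    /* Compare before subtracting: unsigned arithmetic would wrap around. */
    if (repo_size > top - lowest) {
        return w;
    }
    w.start = lowest;
    w.end = top - repo_size;
    w.valid = 1;
    return w;
}
```

With a reach just under 2 GB, a 256 MB repository yields a valid window, while an 8 GB repository does not, and the caller would then request memory without a preferred start address.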
I have also tracked down the cause of the unaligned repository with 1 GB large pages: when large pages are enabled the VM uses shmat rather than mmap to allocate memory. The call to
addressKey = shmget(IPC_PRIVATE, (size_t) byteAmount, shmgetFlags);
was failing (I needed to be root) and the VM code proceeded with allocating memory with default pages. When I run as root, the allocation with large pages succeeds and it is aligned properly.
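
The fallback described above can be pictured with the following Linux-only sketch. It is not the VM's port library code: the `reservation` type, the `reserve_memory` name, the shmget flags and the mmap fallback are all assumptions made for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/mman.h>
#include <sys/shm.h>

typedef struct {
    void *base;           /* NULL if nothing could be reserved */
    int used_large_pages; /* 1 if the huge page path succeeded */
} reservation;

/* Try a huge page shared memory segment first, then fall back to default pages. */
static reservation reserve_memory(size_t byte_amount)
{
    reservation r = { NULL, 0 };
    assert(byte_amount > 0);
#ifdef SHM_HUGETLB
    /* Assumed flags; this may fail without privileges or reserved huge pages. */
    int key = shmget(IPC_PRIVATE, byte_amount, IPC_CREAT | SHM_HUGETLB | 0600);
    if (key != -1) {
        void *p = shmat(key, NULL, 0);
        /* Mark the segment for removal so it is freed once detached. */
        shmctl(key, IPC_RMID, NULL);
        if (p != (void *)-1) {
            r.base = p;
            r.used_large_pages = 1;
            return r;
        }
    }
#endif
    /* Fallback: default page size, alignment is only what the OS happens to give. */
    void *q = mmap(NULL, byte_amount, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (q != MAP_FAILED) {
        r.base = q;
    }
    return r;
}
```

Checking `used_large_pages` after the call makes a silent fallback visible, which is the situation that produced the unaligned repository in the log.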
jenkins test sanity xlinux,win jdk21
Looks like both builds failed due to infra issues:
Linux:
19:09:47 Error occurred for request PUT /artifactory/ci-openj9/Build_JDK21_x86-64_linux_Personal/151/test-images.tar.gz;build.parentNumber=513;build.parentName=Pipeline_Build_Test_JDK21_x86-64_linux;build.buildIdentifier=eclipse-openj9%2Fopenj9%2319516;build.timestamp=1716576887416;build.name=Build_JDK21_x86-64_linux_Personal;build.number=151 HTTP/1.1: Broken pipe (Write failed).
Windows:
15:37:34 ERROR: Cannot delete workspace :Unable to delete 'F:\Users\jenkins\workspace\Build_JDK21_x86-64_windows_Personal\openssl\NUL'. Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts.
jenkins test sanity xlinux,win jdk21