FEX
Allocator: Simplify StealMemory, make it less chatty with kernel space
Overview
This parses /proc/self/maps with a very tiny FSM, and allocates all unmapped regions.
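To illustrate the idea, here is a minimal sketch of a three-state parser that walks maps-style text and reports the gaps between mappings. This is not the code from this PR: the function name `FindUnmappedRegions`, the state names, and the `lo`/`hi` bounds are all hypothetical, and a real implementation would read the file itself and reserve each gap as it is found.

```cpp
#include <algorithm>
#include <cstdint>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical helper (not FEX's API): returns the [begin, end) gaps between
// the mappings listed in /proc/self/maps-style text, clamped to [lo, hi).
// Assumes entries are sorted by address, as the kernel emits them.
std::vector<std::pair<uint64_t, uint64_t>> FindUnmappedRegions(
    std::string_view maps, uint64_t lo, uint64_t hi) {
  enum class State { Begin, End, SkipLine };
  std::vector<std::pair<uint64_t, uint64_t>> gaps;
  State state = State::Begin;
  uint64_t begin = 0, end = 0, cursor = lo;

  auto hex = [](char c) -> int {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
  };
  // Called once per parsed "begin-end" pair.
  auto record = [&]() {
    if (begin > cursor && cursor < hi) {
      gaps.emplace_back(cursor, std::min(begin, hi));
    }
    cursor = std::max(cursor, end);
    begin = end = 0;
  };

  for (char c : maps) {
    const int v = hex(c);
    switch (state) {
    case State::Begin: // accumulating the start address
      if (v >= 0) begin = (begin << 4) | static_cast<uint64_t>(v);
      else if (c == '-') state = State::End;
      else { begin = 0; if (c != '\n') state = State::SkipLine; }
      break;
    case State::End: // accumulating the end address
      if (v >= 0) end = (end << 4) | static_cast<uint64_t>(v);
      else { record(); state = (c == '\n') ? State::Begin : State::SkipLine; }
      break;
    case State::SkipLine: // ignore permissions, offset, device, inode, path
      if (c == '\n') state = State::Begin;
      break;
    }
  }
  if (state == State::End) record(); // input ended right after an address
  if (cursor < hi) gaps.emplace_back(cursor, hi);
  return gaps;
}
```

Only the two addresses at the start of each line matter, so the rest of the line is skipped without tokenizing it, which keeps the parser allocation-free apart from the result vector.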
Performance
This makes /bin/true run in under 10-15 ms, down from 38-43 ms with #1842
Follow ups
- #1914
While this is faster, it doesn't actually fill all the holes, which can cause issues.
main:
800000000000-801000000000 rw-p 00000000 00:00 0 <- Tracking buffer
801000000000-aaaac1240000 ---p 00000000 00:00 0 <- Most of the allocation range
aaaac1240000-aaaac12cb000 r--p 00000000 00:3d 39038 /mnt/Work/Work/work/FEXNew/Build_MBP/Bin/FEXInterpreter
aaaac12cb000-aaaac12da000 ---p 00000000 00:00 0
aaaac12da000-aaaac1563000 r-xp 0008a000 00:3d 39038 /mnt/Work/Work/work/FEXNew/Build_MBP/Bin/FEXInterpreter
aaaac1563000-aaaac1572000 ---p 00000000 00:00 0
aaaac1572000-aaaac159c000 r--p 00312000 00:3d 39038 /mnt/Work/Work/work/FEXNew/Build_MBP/Bin/FEXInterpreter
aaaac159c000-aaaac15ab000 ---p 00000000 00:00 0
aaaac15ab000-aaaac15ad000 rw-p 0033b000 00:3d 39038 /mnt/Work/Work/work/FEXNew/Build_MBP/Bin/FEXInterpreter
aaaac15ad000-aaaac18f5000 rw-p 00000000 00:00 0
aaaac18f5000-ffff7f400000 ---p 00000000 00:00 0 [heap]
ffff7f400000-ffff7fc00000 rw-p 00000000 00:00 0
ffff7fc00000-ffff7fd30000 ---p 00000000 00:00 0
ffff7fd30000-ffff7feb9000 r-xp 00000000 08:02 1835131 /usr/lib/aarch64-linux-gnu/libc.so.6
ffff7feb9000-ffff7fec8000 ---p 00189000 08:02 1835131 /usr/lib/aarch64-linux-gnu/libc.so.6
ffff7fec8000-ffff7fecc000 r--p 00188000 08:02 1835131 /usr/lib/aarch64-linux-gnu/libc.so.6
ffff7fecc000-ffff7fece000 rw-p 0018c000 08:02 1835131 /usr/lib/aarch64-linux-gnu/libc.so.6
ffff7fece000-ffff7feda000 rw-p 00000000 00:00 0
ffff7feda000-ffff7fee0000 ---p 00000000 00:00 0
ffff7fee0000-ffff7fef4000 r-xp 00000000 08:02 1835030 /usr/lib/aarch64-linux-gnu/libgcc_s.so.1
ffff7fef4000-ffff7ff03000 ---p 00014000 08:02 1835030 /usr/lib/aarch64-linux-gnu/libgcc_s.so.1
ffff7ff03000-ffff7ff04000 r--p 00013000 08:02 1835030 /usr/lib/aarch64-linux-gnu/libgcc_s.so.1
ffff7ff04000-ffff7ff05000 rw-p 00014000 08:02 1835030 /usr/lib/aarch64-linux-gnu/libgcc_s.so.1
ffff7ff05000-ffff7ff10000 ---p 00000000 00:00 0
ffff7ff10000-ffff7ff96000 r-xp 00000000 08:02 1835214 /usr/lib/aarch64-linux-gnu/libm.so.6
ffff7ff96000-ffff7ffa5000 ---p 00086000 08:02 1835214 /usr/lib/aarch64-linux-gnu/libm.so.6
ffff7ffa5000-ffff7ffa6000 r--p 00085000 08:02 1835214 /usr/lib/aarch64-linux-gnu/libm.so.6
ffff7ffa6000-ffff7ffa7000 rw-p 00086000 08:02 1835214 /usr/lib/aarch64-linux-gnu/libm.so.6
ffff7ffa7000-ffff7ffb0000 ---p 00000000 00:00 0
ffff7ffb0000-ffff801b9000 r-xp 00000000 08:02 1839082 /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30
ffff801b9000-ffff801c9000 ---p 00209000 08:02 1839082 /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30
ffff801c9000-ffff801d4000 r--p 00209000 08:02 1839082 /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30
ffff801d4000-ffff801d7000 rw-p 00214000 08:02 1839082 /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30
ffff801d7000-ffff801da000 rw-p 00000000 00:00 0
ffff801da000-ffff801f5000 ---p 00000000 00:00 0
ffff801f5000-ffff80220000 r-xp 00000000 08:02 1835034 /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
ffff80220000-ffff80222000 ---p 00000000 00:00 0
ffff80222000-ffff8022c000 rw-p 00000000 00:00 0
ffff8022c000-ffff8022e000 r--p 00000000 00:00 0 [vvar]
ffff8022e000-ffff8022f000 r-xp 00000000 00:00 0 [vdso]
ffff8022f000-ffff80231000 r--p 0002a000 08:02 1835034 /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
ffff80231000-ffff80233000 rw-p 0002c000 08:02 1835034 /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
ffff80233000-ffffcc9f4000 ---p 00000000 00:00 0 <-- Stack growth region that we consume
ffffcc9f4000-ffffcca15000 rw-p 00000000 00:00 0 [stack]
ffffcca15000-1000000000000 ---p 00000000 00:00 0 <- Final remaining 821MB
This PR
800000000000-aaaabfe80000 ---p 00000000 00:00 0 <-- Majority of region theft
aaaabfe80000-aaaabff0b000 r--p 00000000 00:3d 39801 /mnt/Work/Work/work/FEXNew/Build_MBP/Bin/FEXInterpreter
aaaabff0b000-aaaabff1a000 ---p 00000000 00:00 0
aaaabff1a000-aaaac01a3000 r-xp 0008a000 00:3d 39801 /mnt/Work/Work/work/FEXNew/Build_MBP/Bin/FEXInterpreter
aaaac01a3000-aaaac01b2000 ---p 00000000 00:00 0
aaaac01b2000-aaaac01db000 r--p 00312000 00:3d 39801 /mnt/Work/Work/work/FEXNew/Build_MBP/Bin/FEXInterpreter
aaaac01db000-aaaac01ea000 ---p 00000000 00:00 0
aaaac01ea000-aaaac01ec000 rw-p 0033a000 00:3d 39801 /mnt/Work/Work/work/FEXNew/Build_MBP/Bin/FEXInterpreter
aaaac01ec000-aaaac0535000 rw-p 00000000 00:00 0
aaaac0535000-ffff96000000 ---p 00000000 00:00 0 [heap]
ffff96000000-ffff96800000 rw-p 00000000 00:00 0
ffff96800000-ffff968a0000 ---p 00000000 00:00 0
ffff968a0000-ffff96a29000 r-xp 00000000 08:02 1835131 /usr/lib/aarch64-linux-gnu/libc.so.6
ffff96a29000-ffff96a38000 ---p 00189000 08:02 1835131 /usr/lib/aarch64-linux-gnu/libc.so.6
ffff96a38000-ffff96a3c000 r--p 00188000 08:02 1835131 /usr/lib/aarch64-linux-gnu/libc.so.6
ffff96a3c000-ffff96a3e000 rw-p 0018c000 08:02 1835131 /usr/lib/aarch64-linux-gnu/libc.so.6
ffff96a3e000-ffff96a4a000 rw-p 00000000 00:00 0
ffff96a4a000-ffff96a50000 ---p 00000000 00:00 0
ffff96a50000-ffff96a64000 r-xp 00000000 08:02 1835030 /usr/lib/aarch64-linux-gnu/libgcc_s.so.1
ffff96a64000-ffff96a73000 ---p 00014000 08:02 1835030 /usr/lib/aarch64-linux-gnu/libgcc_s.so.1
ffff96a73000-ffff96a74000 r--p 00013000 08:02 1835030 /usr/lib/aarch64-linux-gnu/libgcc_s.so.1
ffff96a74000-ffff96a75000 rw-p 00014000 08:02 1835030 /usr/lib/aarch64-linux-gnu/libgcc_s.so.1
ffff96a75000-ffff96a80000 ---p 00000000 00:00 0
ffff96a80000-ffff96b06000 r-xp 00000000 08:02 1835214 /usr/lib/aarch64-linux-gnu/libm.so.6
ffff96b06000-ffff96b15000 ---p 00086000 08:02 1835214 /usr/lib/aarch64-linux-gnu/libm.so.6
ffff96b15000-ffff96b16000 r--p 00085000 08:02 1835214 /usr/lib/aarch64-linux-gnu/libm.so.6
ffff96b16000-ffff96b17000 rw-p 00086000 08:02 1835214 /usr/lib/aarch64-linux-gnu/libm.so.6
ffff96b17000-ffff96b20000 ---p 00000000 00:00 0
ffff96b20000-ffff96d29000 r-xp 00000000 08:02 1839082 /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30
ffff96d29000-ffff96d39000 ---p 00209000 08:02 1839082 /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30
ffff96d39000-ffff96d44000 r--p 00209000 08:02 1839082 /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30
ffff96d44000-ffff96d47000 rw-p 00214000 08:02 1839082 /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30
ffff96d47000-ffff96d4a000 rw-p 00000000 00:00 0
ffff96d4a000-ffff96d64000 ---p 00000000 00:00 0
ffff96d64000-ffff96d8f000 r-xp 00000000 08:02 1835034 /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
ffff96d8f000-ffff96d91000 ---p 00000000 00:00 0
ffff96d91000-ffff96d9b000 rw-p 00000000 00:00 0
ffff96d9b000-ffff96d9d000 r--p 00000000 00:00 0 [vvar]
ffff96d9d000-ffff96d9e000 r-xp 00000000 00:00 0 [vdso]
ffff96d9e000-ffff96da0000 r--p 0002a000 08:02 1835034 /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
ffff96da0000-ffff96da2000 rw-p 0002c000 08:02 1835034 /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
#### HOLE!
fffffd7e1000-fffffd802000 rw-p 00000000 00:00 0 [stack]
#### HOLE!
Hmm, for ffff96da2000 - fffffd7e1000 maybe they are stack growth pages?
No idea why fffffd802000 - 1000000000000 are not allocated though. Maybe they are inaccessible without MAP_FIXED?
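For reference, a minimal sketch of claiming one specific range at an exact address on Linux. This is an assumption about how such a reservation could be made, not the code in this PR: `ReserveRegion` is a hypothetical name, and the flag choice (`MAP_FIXED_NOREPLACE`, available since Linux 4.17) is one option among several. Without a fixed-address flag the kernel treats the address only as a hint, so the result has to be checked.

```cpp
#include <sys/mman.h>

#include <cstddef>
#include <cstdint>

#ifndef MAP_FIXED_NOREPLACE
#define MAP_FIXED_NOREPLACE 0x100000 // Linux value; assumed for older libc headers
#endif

// Hypothetical helper: reserve [begin, end) as inaccessible address space.
// Returns true only if the mapping was placed exactly at `begin`.
bool ReserveRegion(uint64_t begin, uint64_t end) {
  const size_t size = static_cast<size_t>(end - begin);
  void* want = reinterpret_cast<void*>(static_cast<uintptr_t>(begin));
  void* got = mmap(want, size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE | MAP_FIXED_NOREPLACE,
                   -1, 0);
  if (got == MAP_FAILED) {
    return false; // e.g. EEXIST when something is already mapped there
  }
  if (got != want) {
    // Kernels older than 4.17 ignore the flag and treat the address as a hint.
    munmap(got, size);
    return false;
  }
  return true;
}
```

Unlike plain MAP_FIXED, this variant refuses to replace an existing mapping, so a stale view of the address space cannot clobber something that was mapped in the meantime.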
@Sonicadvance1 tested here locally and all seemed OK
(note to self: There's also the 32-bit side of things that needs to be looked into here)
The fmt submodule needs to stop being updated in this PR.
This still generates a memory map that has holes in the 64-bit space.
800000000000-aaaab6150000 ---p 00000000 00:00 0
aaaab6150000-aaaab61db000 r--p 00000000 00:3a 59076 /mnt/Work/Work/work/FEXNew/Build_MBP/Bin/FEXInterpreter
aaaab61db000-aaaab61ea000 ---p 00000000 00:00 0
aaaab61ea000-aaaab6473000 r-xp 0008a000 00:3a 59076 /mnt/Work/Work/work/FEXNew/Build_MBP/Bin/FEXInterpreter
aaaab6473000-aaaab6482000 ---p 00000000 00:00 0
aaaab6482000-aaaab64ac000 r--p 00312000 00:3a 59076 /mnt/Work/Work/work/FEXNew/Build_MBP/Bin/FEXInterpreter
aaaab64ac000-aaaab64bb000 ---p 00000000 00:00 0
aaaab64bb000-aaaab64bd000 rw-p 0033b000 00:3a 59076 /mnt/Work/Work/work/FEXNew/Build_MBP/Bin/FEXInterpreter
aaaab64bd000-aaaab6805000 rw-p 00000000 00:00 0
### HOLE
ffff96800000-ffff97000000 rw-p 00000000 00:00 0
### HOLE
ffff97170000-ffff972f9000 r-xp 00000000 08:02 1835131 /usr/lib/aarch64-linux-gnu/libc.so.6
ffff972f9000-ffff97308000 ---p 00189000 08:02 1835131 /usr/lib/aarch64-linux-gnu/libc.so.6
ffff97308000-ffff9730c000 r--p 00188000 08:02 1835131 /usr/lib/aarch64-linux-gnu/libc.so.6
ffff9730c000-ffff9730e000 rw-p 0018c000 08:02 1835131 /usr/lib/aarch64-linux-gnu/libc.so.6
ffff9730e000-ffff9731a000 rw-p 00000000 00:00 0
### HOLE
ffff97320000-ffff97334000 r-xp 00000000 08:02 1835030 /usr/lib/aarch64-linux-gnu/libgcc_s.so.1
ffff97334000-ffff97343000 ---p 00014000 08:02 1835030 /usr/lib/aarch64-linux-gnu/libgcc_s.so.1
ffff97343000-ffff97344000 r--p 00013000 08:02 1835030 /usr/lib/aarch64-linux-gnu/libgcc_s.so.1
ffff97344000-ffff97345000 rw-p 00014000 08:02 1835030 /usr/lib/aarch64-linux-gnu/libgcc_s.so.1
### HOLE
ffff97350000-ffff973d6000 r-xp 00000000 08:02 1835214 /usr/lib/aarch64-linux-gnu/libm.so.6
ffff973d6000-ffff973e5000 ---p 00086000 08:02 1835214 /usr/lib/aarch64-linux-gnu/libm.so.6
ffff973e5000-ffff973e6000 r--p 00085000 08:02 1835214 /usr/lib/aarch64-linux-gnu/libm.so.6
ffff973e6000-ffff973e7000 rw-p 00086000 08:02 1835214 /usr/lib/aarch64-linux-gnu/libm.so.6
### HOLE
ffff973f0000-ffff975f9000 r-xp 00000000 08:02 1839082 /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30
ffff975f9000-ffff97609000 ---p 00209000 08:02 1839082 /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30
ffff97609000-ffff97614000 r--p 00209000 08:02 1839082 /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30
ffff97614000-ffff97617000 rw-p 00214000 08:02 1839082 /usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30
ffff97617000-ffff9761a000 rw-p 00000000 00:00 0
### HOLE
ffff9763d000-ffff97668000 r-xp 00000000 08:02 1835034 /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
### HOLE
ffff9766a000-ffff97674000 rw-p 00000000 00:00 0
ffff97674000-ffff97676000 r--p 00000000 00:00 0 [vvar]
ffff97676000-ffff97677000 r-xp 00000000 00:00 0 [vdso]
ffff97677000-ffff97679000 r--p 0002a000 08:02 1835034 /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
ffff97679000-ffff9767b000 rw-p 0002c000 08:02 1835034 /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
### HOLE
ffffc9763000-ffffc9784000 rw-p 00000000 00:00 0 [stack]
### HOLE
Didn't pass CI, can't test.
Tested this with the latest changes and it looks like it doesn't leave any holes anymore. Tested on both x86-64 and AArch64 hosts. I feel like I looked up why the final page couldn't be allocated on x86-64 before, and I had to look it up again. From the Linux kernel source:
* User space process size. This is the first address outside the user range.
* There are a few constraints that determine this:
*
* On Intel CPUs, if a SYSCALL instruction is at the highest canonical
* address, then that syscall will enter the kernel with a
* non-canonical return address, and SYSRET will explode dangerously.
* We avoid this particular problem by preventing anything
* from being mapped at the maximum canonical address.
*
* On AMD CPUs in the Ryzen family, there's a nasty bug in which the
* CPUs malfunction if they execute code from the highest canonical page.
* They'll speculate right off the end of the canonical space, and
* bad things happen. This is worked around in the same way as the
* Intel problem.
So that's a cool bug.
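As a rough worked example of that constraint, assuming a 47-bit user address space and 4 KiB pages (typical for x86-64 with 4-level paging; both values are assumptions here, not something this PR states), the first address outside the user range sits one page below the top of the lower canonical half. The name `UserRangeEnd` is hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: first address outside the user range when the kernel
// refuses to map the highest canonical page.
constexpr uint64_t UserRangeEnd(uint64_t va_bits, uint64_t page_size) {
  return (uint64_t{1} << va_bits) - page_size;
}

// Assumed values: 47 user VA bits, 4 KiB pages.
static_assert(UserRangeEnd(47, 4096) == 0x7ffffffff000ULL,
              "the final page [0x7ffffffff000, 0x800000000000) stays unmapped");
```

Under those assumptions the last mappable byte is 0x7fffffffefff, so a hole covering the final page is expected on such a host rather than a sign of a missed region.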
But now it looks like all the holes in memory are filled on both x86-64 and 48-bit VA AArch64, for both 32-bit and 64-bit guest applications.
Additionally, with some microbenchmarking:
AArch64 48-bit VA allocation improved from ~18106 us to ~259 us, a ~70x reduction in time.
AArch64 >32-bit VA allocation improved from ~83069 us to ~456 us, a ~181x reduction in time.
Roughly the same on x86-64 host.
I still need to go over the implementation details on this, but it's a good perf increase. Good job!
Split the follow up work in #1914
@Sonicadvance1 done