mps

Issue with Apple silicon write-xor-execute memory requirements

Open tynor opened this issue 3 years ago • 9 comments

Hello,

I greatly appreciate the addition of support for macOS/aarch64 in MPS. For me, MPS builds successfully (using the d894a3f90a commit), but fails when using the VM arena class because MPS requests PROT_EXEC when creating its virtual memory mapping:

The MPS detected a problem!
build/src/mps/code/vmix.c:180: MPS ASSERTION FAILED: errno == ENOMEM
See the "Assertions" section in the reference manual:
https://www.ravenbrook.com/project/mps/master/manual/html/topic/error.html#assertions

I have verified that removing PROT_EXEC from the mmap(2) call gets past that particular issue.

One of the features of M1-based Macs is hardware enforcement of having only one of PROT_WRITE or PROT_EXEC on a page at a time. While Apple provides APIs for managing pages that need PROT_EXEC, my question is whether MPS actually needs to execute code generated on the fly in the first place. I have not looked through much of the code, but I did notice that MPS also calls mprotect(2) with PROT_EXEC, even though the AccessSet values indicate it is only concerned with being able to write.

I'd be very interested to see if there is a workaround I'm missing as well, since I imagine tests are passing on Apple silicon for someone.

Thank you very much.

tynor avatar Jan 15 '22 01:01 tynor

Thanks for reporting the problem. You may be the first to test the arm64 support on Apple Silicon! I have only tested it on Linux.

First, if I understand Apple's documentation correctly, you should be able to work around this by turning off Hardened Runtime for the application, or by setting the "Allow Unsigned Executable Memory Entitlement". (Let me know if I've misunderstood this.)

Here are some solution ideas for applications that need to turn on Hardened Runtime:

  1. It would be nice to automatically detect the problem and disable the use of PROT_EXEC. (This wouldn't affect backwards-compatibility since no-one can have been using the MPS to manage compiled code objects in Hardened Runtime on Apple Silicon yet, precisely because of this issue.)

    But can we detect the right set of conditions? We need to be on Apple Silicon, have the Hardened Runtime capability, but not the Allow Unsigned Executable Memory Entitlement. Are there API calls to detect this condition? Is this even the right set of conditions? It seems risky to try to implement this kind of thing without expertise. Perhaps we could try an initial mmap(..., PROT_WRITE | PROT_EXEC, ...) and see if it fails with — what value of errno? The man page does not say!

    Also, this doesn't solve the problem of how to manage compiled code objects in the MPS.

  2. Add an option somewhere, for example on mps_arena_create_k(), that disables the use of PROT_EXEC throughout the arena.

    This is less risky than (1) but not as convenient as it passes the problem to the MPS client. It also doesn't solve the problem of how to manage compiled code objects in the MPS.

  3. In combination with (1) or (2), add a mechanism, perhaps at the pool level, for supporting MAP_JIT. This will need some thought because (i) we'd need to update the shield and the fault handler to understand the MAP_JIT mechanism; (ii) at the moment the MPS assumes that all mapped pages are essentially identical, so that the spare page mechanism can ArenaFree a page from one pool and pass it to another pool via ArenaAlloc. Can we set the MAP_JIT flag by another call to mmap() passing MAP_FIXED | MAP_JIT and clear it similarly? If so, probably worth it to hang on to the memory.

    We expect that few people need movable code objects or scannable code objects, so maybe the simplest thing that would work would be to provide an option to mps_pool_create_k() allowing the caller to specify additional MAP_ flags that will be or-ed into the mmap() call, and support this only on Leaf Only (LO) pools. The client would be responsible for everything else.

gareth-rees avatar Jan 15 '22 11:01 gareth-rees

Thanks for the quick response. A couple more thoughts:

After some poking around, I found the code .Net uses to detect the hardened runtime. Not exactly the cleanest, and does not check for the entitlement.

If calling mmap(2) again does not work to manipulate MAP_JIT, would unmapping then immediately remapping MAP_FIXED (with/without MAP_JIT respectively) at the same address work?

I wouldn't be opposed to going with option 1 in the short term, and punting support for code objects on Apple silicon until the need arises. Though missing functionality like that on a given platform target might not be acceptable.

I do like the combination of being able to disable PROT_EXEC via an arena argument, then being able to re-enable it along with MAP_JIT for a specific pool. Though supporting MAP_JIT within MPS itself sounds like a headache, and platform specific, unfortunately.

tynor avatar Jan 15 '22 22:01 tynor

Some context that probably further complicates matters (I'm sorry): In my language that uses the MPS, I store code objects in pools managed by the MPS. As these contain references to other code objects (e.g. for function calls), I store them in a regular AMC pool. Sadly, this means that only allowing the MAP_JIT on LO pools is not really an option for me currently. On the other hand, my usage is probably a bit niche. Also, I do not support Apple Silicon at the moment, and I don't mind too much maintaining a (slight) fork that modifies the calls to mmap() as necessary if I decide to make a port for Apple Silicon.

fstromback avatar Jan 19 '22 09:01 fstromback

Fascinating. Storm is exactly the sort of use case we had in mind for the MPS and it's good to see you using it there. Would you be readily able to allocate your code objects in a separate pool from other objects? Perhaps an instance of a distinct pool class, or an instance of an existing pool class created with a pool creation option? I must have a play with Storm some time.

NickBarnes avatar Jan 19 '22 17:01 NickBarnes

Thank you for the kind words! I am actually using a separate pool for the code allocations already, so that is not an issue. I am currently using the amc class for that pool as well, as it seemed to best fit my needs at the time. I do realize that it might not be optimal, as access patterns for code are likely quite different from those for other data (e.g. code objects are generally long-lived compared to data). (If you are interested, the MPS integration is in the file Gc/MPS/Impl.cpp in the repository available at git://storm-lang.org/storm.git)

fstromback avatar Jan 19 '22 18:01 fstromback

Right now on Apple Silicon, the MPS "doesn't have a configuration for this platform out of the box".

I use an M1 MacBook as my main development machine, so I would at least like basic support.

I am willing to make a PR to:

  1. Add the proper support for this to mpstd.h (make it a supported
  2. On Apple Silicon platforms (and Apple Silicon only), disable the PROT_EXEC flag pending a "proper" solution
    • This would be behind a #if defined(__APPLE__) && defined(__arm64__), so it wouldn't affect other platforms

Putting PROT_EXEC behind an #ifdef is not the most elegant solution. However, I think it's the minimum viable patch to get things working on Apple Silicon. Right now the current version of MPS has no working support for M1 Macs, so this is strictly an improvement 😉

Would you be willing to accept a PR with this minimal support that I have described above? I already have a draft that I have written on my (apple silicon) laptop.

It does not affect other platforms, because everything Apple-silicon specific is behind an #ifdef.

Long term fix

I think that, as a longer-term solution, clients that need PROT_EXEC on Apple Silicon platforms should request it explicitly. This is really just a reflection of Apple's requirements, so the MPS isn't adding any requirements of its own; it's just passing them on to clients. Clients will have to use Apple's new pthread_jit_write_protect_np API anyway, so we really aren't shifting much of a burden onto them.

However we don't need to block short term support for apple silicon on any of these decisions. I can make a patch tomorrow with the basic support that I have described above.

Also Apple is not the only platform that is strict about W^X. OpenBSD has had mandatory requirements for W^X, so this problem was bound to crop up eventually :)

Techcable avatar Jan 30 '22 09:01 Techcable

Okay, I have a minimal viable patch in #77 that simply disables the PROT_EXEC flag on Apple Silicon.

It will not work for @fstromback's use case of using the MPS to manage executable memory. However, supporting that properly is likely going to require much more work, and very detailed integration with Apple's new JIT API 😦

I think this is good enough to merge as is, and it will work fine as long as you don't use the MPS to manage executable memory (the scheme interpreter compiles and runs with it).

Techcable avatar Jan 30 '22 09:01 Techcable

This sounds like a good approach to me!

Having to request executable memory in the future also seems reasonable, as it is a bit of a niche use case. If this API were designed to encourage W^X, that would also be nice, but I am not entirely sure how to do that nicely.

fstromback avatar Jan 30 '22 14:01 fstromback

I have been thinking a bit more about this. I have a rough idea for how to implement support for write XOR execute in a way that works for what I need, and that I think would work for a future port to Apple Silicon as well. I will try to implement this idea and make a PR for it as a starting point for exploring possible designs. If I get it to work it would be nice to enforce write XOR execute in my language even on platforms where it is not required.

fstromback avatar Feb 26 '22 21:02 fstromback