deno
deno copied to clipboard
feat(ext/ffi): Implement FFI fast-call trampoline with Dynasmrt
WIP
We want to wait until we support Uint8Array in FastAPI before landing this one (which should be soon) because it will require a bit more complex C code. I'm sure we can figure out how to also do this in ASM.
Absolutely. There's still work to do to get this PR ready anyway.
Have you given any thought to ARM?
the intent is to implement support for Apple Silicon as well as windows x64, but haven't really delved into it yet. First I want to see how the SysV AMD64 implementation consolidates after the feedback.
Have you given any thought to ARM?
aarch64-apple-darwin
and x86_64-pc-windows-msvc
are now ready to review
@arnauorriols please rebase.
@piscisaureus please review
@arnauorriols please rebase.
On it. Not a trivial rebase though, gonna take me a bit.
@arnauorriols please rebase.
PR rebased
Before merging this PR, a decision must be taken (and documented) on the open question described in the PR summary:
V8 does not currently follow this Apple's policy, and instead aligns all arguments to 8 Byte boundaries. A decision needs to be taken:
- leave it broken and wait for v8 to fix the bug
- Adapt to v8 bug and follow its ABI instead of Apple's. When V8 fixes the implementation, we'll have to fix it here as well
Discussion in https://github.com/denoland/deno/pull/15305#discussion_r963002793.
my ocd is telling me to give this new JIT assembler a name. any suggestions?
Apropos the arm64 macos stack alignment issue: I would suggest to stick with V8's behavior for now. Leave some pointers in the code to make it easy to switch once V8 fixes the bug.
(I may give it a try. I don't have a M1 or M2 but it's straightforward enough in theory.)
Apropos the arm64 macos stack alignment issue: I would suggest to stick with V8's behavior for now. Leave some pointers in the code to make it easy to switch once V8 fixes the bug.
I agree
@bnoordhuis i've tried a few different things including modifying StackSlot
to emit different alignments but that seems to be hitting an issue with the stack pointer not being padded to 8 byte alignment which leads me to think the slot allocator needs to be refactored a bit here... best of luck.