wasmtime
wasmtime copied to clipboard
[meta] Migrate instruction selection to ISLE
This is a meta issue to track the migration from hand-written instruction selection and lowering over to using the ISLE DSL.
As you port lowering for a clif opcode over to ISLE, please check the associated box (or leave a comment, if you don't have edit permissions and I or someone else can check the box for you). Hopefully this will help us focus our porting efforts and finish the migration in a timely manner, as well as avoid stepping on each others toes by having two people accidentally port the same opcode lowerings.
cc @alexcrichton @cfallin @abrown @jlb6740 @uweigand @sparker-arm @akirilov-arm
x86_64 -- DONE!
- [x]
Opcode::Clz - [x]
Opcode::Ctz - [x]
Opcode::Popcnt - [x]
Opcode::Bitrev - [x]
Opcode::IsNull - [x]
Opcode::IsInvalid - [x]
Opcode::Uextend - [x]
Opcode::Sextend - [x]
Opcode::Breduce - [x]
Opcode::Bextend - [x]
Opcode::Ireduce - [x]
Opcode::Bint - [x]
Opcode::Icmp - [x]
Opcode::Fcmp - [x]
Opcode::FallthroughReturn - [x]
Opcode::Return - [x]
Opcode::Call - [x]
Opcode::CallIndirect - [x]
Opcode::Debugtrap - [x]
Opcode::Trapif - [x]
Opcode::Trapff - [x]
Opcode::WideningPairwiseDotProductS - [x]
Opcode::Fadd - [x]
Opcode::Fsub - [x]
Opcode::Fmul - [x]
Opcode::Fdiv - [x]
Opcode::Fmin - [x]
Opcode::Fmax - [x]
Opcode::FminPseudo - [x]
Opcode::FmaxPseudo - [x]
Opcode::Sqrt - [x]
Opcode::Fpromote - [x]
Opcode::FvpromoteLow - [x]
Opcode::Fdemote - [x]
Opcode::Fvdemote - [x]
Opcode::FcvtFromSint - [x]
Opcode::FcvtLowFromSint - [x]
Opcode::FcvtFromUint - [x]
Opcode::FcvtToUint - [x]
Opcode::FcvtToUintSat - [x]
Opcode::FcvtToSint - [x]
Opcode::FcvtToSintSat - [x]
Opcode::IaddPairwise - [x]
Opcode::UwidenHigh - [x]
Opcode::UwidenLow - [x]
Opcode::SwidenHigh - [x]
Opcode::SwidenLow - [x]
Opcode::Snarrow - [x]
Opcode::Unarrow - [x]
Opcode::Bitcast - [x]
Opcode::Fabs - [x]
Opcode::Fneg - [x]
Opcode::Fcopysign - [x]
Opcode::Ceil - [x]
Opcode::Floor - [x]
Opcode::Nearest - [x]
Opcode::Trunc - [x]
Opcode::Load - [x]
Opcode::Uload8 - [x]
Opcode::Sload8 - [x]
Opcode::Uload16 - [x]
Opcode::Sload16 - [x]
Opcode::Uload32 - [x]
Opcode::Sload32 - [x]
Opcode::Sload8x8 - [x]
Opcode::Uload8x8 - [x]
Opcode::Sload16x4 - [x]
Opcode::Uload16x4 - [x]
Opcode::Sload32x2 - [x]
Opcode::Uload32x2 - [x]
Opcode::Store - [x]
Opcode::Istore8 - [x]
Opcode::Istore16 - [x]
Opcode::Istore32 - [x]
Opcode::AtomicRmw - [x]
Opcode::AtomicCas - [x]
Opcode::AtomicLoad - [x]
Opcode::AtomicStore - [x]
Opcode::Fence - [x]
Opcode::FuncAddr - [x]
Opcode::SymbolValue - [x]
Opcode::StackAddr - [x]
Opcode::Select - [x]
Opcode::Selectif - [x]
Opcode::SelectifSpectreGuard - [x]
Opcode::Udiv - [x]
Opcode::Urem - [x]
Opcode::Sdiv - [x]
Opcode::Srem - [x]
Opcode::Umulhi - [x]
Opcode::Smulhi - [x]
Opcode::GetPinnedReg - [x]
Opcode::SetPinnedReg - [x]
Opcode::Vconst - [x]
Opcode::RawBitcast - [x]
Opcode::Shuffle - [x]
Opcode::Swizzle - [x]
Opcode::Insertlane - [x]
Opcode::Extractlane - [x]
Opcode::ScalarToVector - [x]
Opcode::Splat - [x]
Opcode::VanyTrue - [x]
Opcode::VallTrue - [x]
Opcode::VhighBits - [x]
Opcode::Iconcat - [x]
Opcode::Isplit - [x]
Opcode::TlsValue - [x]
Opcode::SqmulRoundSat - [x]
Opcode::Uunarrow
aarch64
- [x]
Opcode::Load - [x]
Opcode::Uload8 - [x]
Opcode::Sload8 - [x]
Opcode::Uload16 - [x]
Opcode::Sload16 - [x]
Opcode::Uload32 - [x]
Opcode::Sload32 - [x]
Opcode::Sload8x8 - [x]
Opcode::Uload8x8 - [x]
Opcode::Sload16x4 - [x]
Opcode::Uload16x4 - [x]
Opcode::Sload32x2 - [x]
Opcode::Uload32x2 - [x]
Opcode::Store - [x]
Opcode::Istore8 - [x]
Opcode::Istore16 - [x]
Opcode::Istore32 - [x]
Opcode::StackAddr - [x]
Opcode::AtomicRmw - [x]
Opcode::AtomicCas - [x]
Opcode::AtomicLoad - [x]
Opcode::AtomicStore - [x]
Opcode::Fence - [ ]
Opcode::Select - [ ]
Opcode::Selectif - [ ]
Opcode::SelectifSpectreGuard - [x]
Opcode::Bitselect - [x]
Opcode::Vselect - [ ]
Opcode::Trueif - [ ]
Opcode::Trueff - [x]
Opcode::IsNull - [x]
Opcode::IsInvalid - [x]
Opcode::Copy - [x]
Opcode::Breduce - [x]
Opcode::Ireduce - [x]
Opcode::Bextend - [x]
Opcode::Bmask - [x]
Opcode::Bint - [x]
Opcode::Bitcast - [x]
Opcode::FallthroughReturn - [x]
Opcode::Return - [x]
Opcode::Icmp - [x]
Opcode::Fcmp - [x]
Opcode::Debugtrap - [x]
Opcode::Trap - [x]
Opcode::ResumableTrap - [ ]
Opcode::Trapif - [ ]
Opcode::Trapff - [x]
Opcode::FuncAddr - [x]
Opcode::SymbolValue - [x]
Opcode::Call - [x]
Opcode::CallIndirect - [x]
Opcode::GetPinnedReg - [x]
Opcode::SetPinnedReg - [x]
Opcode::Vconst - [x]
Opcode::RawBitcast - [x]
Opcode::Extractlane - [x]
Opcode::Insertlane - [x]
Opcode::Splat - [x]
Opcode::ScalarToVector - [x]
Opcode::VallTrue - [x]
Opcode::VanyTrue - [x]
Opcode::VhighBits - [x]
Opcode::Shuffle - [x]
Opcode::Swizzle - [x]
Opcode::Isplit - [x]
Opcode::Iconcat - [x]
Opcode::Imax - [x]
Opcode::Umax - [x]
Opcode::Umin - [x]
Opcode::Imin - [x]
Opcode::IaddPairwise - [x]
Opcode::WideningPairwiseDotProductS - [x]
Opcode::Fadd - [x]
Opcode::Fsub - [x]
Opcode::Fmul - [x]
Opcode::Fdiv - [x]
Opcode::Fmin - [x]
Opcode::Fmax - [x]
Opcode::FminPseudo - [x]
Opcode::FmaxPseudo - [x]
Opcode::Sqrt - [x]
Opcode::Fneg - [x]
Opcode::Fabs - [x]
Opcode::Fpromote - [x]
Opcode::Fdemote - [x]
Opcode::Ceil - [x]
Opcode::Floor - [x]
Opcode::Trunc - [x]
Opcode::Nearest - [x]
Opcode::Fma - [x]
Opcode::Fcopysign - [x]
Opcode::FcvtToUint - [x]
Opcode::FcvtToSint - [x]
Opcode::FcvtFromUint - [x]
Opcode::FcvtFromSint - [x]
Opcode::FcvtToUintSat - [x]
Opcode::FcvtToSintSat - [x]
Opcode::IaddIfcout - [x]
Opcode::Iabs - [x]
Opcode::AvgRound - [x]
Opcode::Snarrow - [x]
Opcode::Unarrow - [x]
Opcode::Uunarrow - [x]
Opcode::SwidenLow - [x]
Opcode::SwidenHigh - [x]
Opcode::UwidenLow - [x]
Opcode::UwidenHigh - [x]
Opcode::TlsValue - [x]
Opcode::SqmulRoundSato - [x]
Opcode::FcvtLowFromSint - [x]
Opcode::FvpromoteLow - [x]
Opcode::Fvdemote - [ ] Branches
s390x -- DONE!
- [x] Calls
- [x] Returns
- [x] Traps
- [x] Branches
Subscribe to Label Action
cc @cfallin, @fitzgen
Thus the following users have been cc'd because of the following labels:
- cfallin: isle
- fitzgen: isle
To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.
I'm looking into the remaining issues on s390x. Specifically, I already have patches for branches and traps, which I'll submit shortly. For calls and returns, the issues really are cross-platform - I suspect we'll have to move those to ISLE for all targets at the same time ...
For reference, here's the list of issues I've noticed:
- For traps, the only problem is that (some) traps need to be emitted as safepoints, and there is currently no common-code mechanism to emit safepoints from ISLE. (I notice that a few traps were moved to ISLE on x64 - but they're simply no longer safepoints now, which looks like a bug to me ...) This is straightforward to fix by having the ISLE code hold on to tuples of (instruction, is_safepoint) instead of just instructions, similar to what is already done elsewhere. I have a patch for that.
- For branches, there are two problems. The first is a straightforward bug in
clif.islefixed here: https://github.com/bytecodealliance/wasmtime/pull/3718 . The second is more interesting. Currently, branches are emitted via special logic inlower_branch_group, which gets two list of branches and branch targets as input. The list of branches is not a problem; we only ever emit the first branch of that group, and that can be passed to ISLE just like any other instruction. However, when emitting the branch as machine instruction, we need to use the proper branch targets. There is currently no way to access this in ISLE, and I see no obvious way to pass this list into the ISLE machinery either. I have a patch to make this work by having the branch target list instead be generated when needed via an ISLE constructor calling a new context callbackcurrent_block_branch_targets. This works for me, but has the drawback that the context now needs to keep new global state identifying the current block being emitted. (But maybe that's not really a problem, given that there is already global state identifying the current instruction anyway?) - For both calls and returns, the first issue is that these fundamentally receive variable-argument lists as inputs, and need some way to iterate over these lists in ISLE. These lists are currently represented via a
ValueSlicedata type, but it turns out this doesn't work at all: any use of aValueSliceargument in ISLE code will cause the generated Rust code to fail to compile with borrow-checker errors. This is because every single generated constructor and extractor takes a mutable borrow on the context via a&mut selfargument - and theValueSlicetype refers to context memory and therefore extends the lifetime of that mutable borrow of any constructor or extractor that returns a variable of that type, which means that while aValueSlicevariable is live, no other constructor or extractor can ever be called, making it quite useless. (This would also be a problem when handling other variable-argument opcodes likeload_complex.) I have an experimental patch that uses a pair of aValueListidentifier and an integer offset into the list instead of theValueSlice, which seems to fix this - not sure if this is the preferred solution. With that patch, I can successfully handle returns. - For calls, there are still more fundamental problems. The current code delegates most of the actual instruction emission for calls to the
ABICallertrait and associated implementation (which calls back into target code to get ABI details). This code just usesctx.emitall over the place. Therefore, it is not really usable as part of CLIF processing e.g. via constructors, because CLIF separately buffers instructions, and mixing CLIFemit_insnand contextctx.emitscrambles the sequence (and also confuses the CLIF register mapper because it no longer sees everything). I think the straightforward fix would be to change theABICallerimplementation to emit CLIF instructions instead - but that likely means that all targets have to move calls to ISLE at the same time. (I do not have a patch for this.)
I'm picking up AtomicRMW for AArch64.
I'm going to start working down the x86_64 opcode list from the top -- clz/btz/popcnt/bitrev to start, I think. @abrown let me know what your next plans are and we can make sure not to duplicate work!
Just for the record, I am working on x64's icmp, fcmp and select instructions.
Thanks @abrown. @abrown @cfallin for the record, I'd like to get started with something simple. I'd like to claim fabs and fneg.
I'm looking at loads and stores next, FYI.
I'm picking up AtomicCas and IaddPairwise for aarch64.
@abrown are you still planning to take a look at stores on x64? I'm happy to take those first thing Mon if you haven't started yet, as I've got some isel improvements I want to do that involve them (load-op-store patterns). If you're close then no worries though!
Well, I haven't started stores yet... so take them if you want to!
I've also picked up snarrow, unarrow, uunarrow and fvdemote.
I've started on icmp, which doesn't look like it's going to be fun!
I've finished up translating icmp for x64
Transition to ISLE is now complete for s390x. All opcodes are now lowered via ISLE.
I've made a PR for the aarch64 min/max instructions
Last week we also merged bmask/bextend/ireduce/breduce for aarch64
I've picked up iabs for aarch64.
And also picked up swizzle and scalartovector.
Edit: also planning to do fadd down to fma.
I'm working on finishing the x64 migration to ISLE. Feel free to grab instructions if you'd like to work on them, otherwise I'll continue working down the list.
@dheaton-arm I'm planning to help push the aarch64 work to completion; I'm hoping to tackle some of the trickier remaining ones (loads and stores, with amode lowering; calls; branches; icmp/fcmp and flags users). Is that OK or have you already started on some of these?
I've already started on icmp and fcmp, but everything you've mentioned before those should be fine.
@cfallin I have an upcoming patch that moves the AMode enum definition to ISLE, but I don't plan to touch the actual lowering rules for loads and stores. We already have the amode helper in ISLE, so the latter bit should be doable without waiting for my patch to land (of course, AMode lowering itself would depend on it).
My patch changes quite a lot of code because, as far as I can tell, enum definitions in ISLE always result in enums with named fields, while the existing AMode enum has unnamed fields.
OK great, I'll take loads/stores themselves, calls, and branches. Lowering of flags-using instructions will intersect with how we do icmp/fcmp so I'll hold off on those. Thanks!
My patch changes quite a lot of code because, as far as I can tell, enum definitions in ISLE always result in enums with named fields, while the existing AMode enum has unnamed fields.
Yep, but there's no fundamental reason for that, it was just a "build it as we need it" sort of thing. We could look at supporting unnamed-field enum variants if it's simpler; it would probably be a 1-2 day refactor.
Put up PRs for call/ret and loads/stores today; the remaining instructions on aarch64 that don't depend on flags or pattern-matching with icmp/fcmp somehow are:
- GetPinnedReg/SetPinnedReg
- ExtractLane/InsertLane
- StackAddr
- IaddIfcout
- TlsValue
I'm happy to do all of these tomorrow, unless someone else objects or has partial work here, then I can do the branch and flags-related ones (trueif/trueff, selectif/selectff/select, brif/brff/brz/brnz) once @dheaton-arm 's icmp/fcmp work is done.
The x86_64 backend has been migrated to ISLE :tada:
I'm happy to do all of these tomorrow, unless someone else objects or has partial work here, then I can do the branch and flags-related ones (trueif/trueff, selectif/selectff/select, brif/brff/brz/brnz) once @dheaton-arm 's icmp/fcmp work is done.
@cfallin Just as a heads up, after my icmp patch I've been working on true{if,ff} and select{_,if,if_spectreguard}. I haven't started on the br* ops though. :)
@dheaton-arm great! I shifted over to some regalloc semantics cleanup work but I can come back and do br* ops next, likely Thu or Fri, unless you want to claim then before then!
@dheaton-arm great! I shifted over to some regalloc semantics cleanup work but I can come back and do
br*ops next, likely Thu or Fri, unless you want to claim then before then!
I'm starting on the branch ops now, assuming you haven't yet. (Starting with jump and br_icmp..brff.)
@dheaton-arm that sounds great; I didn't get to it last week as the regalloc stuff took longer than expected. Please feel free to grab all the branch ops and let me know if you need any help with any of the infra for that!
The AArch64 backend has now been fully ported to ISLE.