Teach implVecElemLval to emit operations better suited for AArch64
When creating an lval pair for the value and the type, we currently bundle the entire address calculation into a Vptr, which stays attached to each load until lowering to assembly. The downside is that we duplicate the work of constructing two addresses that differ by only 1 byte. I think it's better to emit an explicit lea to set up a common base in advance, then construct a pair of Vptrs that differ only by a constant offset. The loads can then use the reg+imm addressing mode.
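To illustrate the idea outside of vasm, here is a minimal C++ sketch of the two address-construction strategies. It is not the HHVM implementation: `LvalPair`, `elemLvalBundled`, `elemLvalCommonBase`, and the constants `kElemSize`, `kValOff`, and `kTypeOff` are all hypothetical names and placeholder values chosen for the example, and they do not model the real vec layout.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical element layout, for illustration only.
constexpr std::size_t kElemSize = 16; // assumed stride between elements
constexpr std::size_t kValOff   = 0;  // assumed offset of the value
constexpr std::size_t kTypeOff  = 8;  // assumed offset of the type

// A pair of addresses: one for the value, one for the type.
struct LvalPair {
  const unsigned char* val;
  const unsigned char* type;
};

// "Bundled" form: each address repeats the full base + idx * scale + disp
// calculation, mirroring a Vptr that carries the whole computation.
inline LvalPair elemLvalBundled(const unsigned char* arr, std::size_t idx) {
  assert(arr != nullptr);
  return LvalPair{
    arr + idx * kElemSize + kValOff,
    arr + idx * kElemSize + kTypeOff,
  };
}

// "Common base" form: compute base + idx * scale once (the explicit lea),
// then derive both addresses with constant offsets only, which is the shape
// a reg+imm addressing mode can encode directly.
inline LvalPair elemLvalCommonBase(const unsigned char* arr, std::size_t idx) {
  assert(arr != nullptr);
  const unsigned char* base = arr + idx * kElemSize; // the single "lea"
  return LvalPair{base + kValOff, base + kTypeOff};
}
```

Both functions return the same addresses for any `arr` and `idx`; the second one only makes the shared part of the computation explicit, so each load is left with a register plus a small immediate.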
@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this in D86211136. (Because this pull request was imported automatically, there will not be any future comments.)