Use real calls
CakeML currently uses a link-register calling convention on all targets, but even on targets where that convention is native it does not use the "jump and link" instruction. Switching to it could be done without modifying any global invariants, but because the return address would be an implied label at the end of the jump instruction, it could be somewhat tricky to express in the labLang semantics. Benefits are likely to be largest on riscv and mips (removing ~800kB of auipc/addi or bltzal/daddiu pairs, respectively, from the bootstrapped compiler) and about half that on arm7/arm8. In theory there is also a branch-prediction benefit, since hardware return predictors typically key on call instructions, but its size is hard to estimate.
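As a rough illustration of where the size saving would come from, the following sketch compares one call site before and after the change on riscv. The instruction strings are illustrative placeholders rather than CakeML's actual output, and the 4-byte instruction size assumes uncompressed RISC-V encodings.

```python
# Hypothetical model of a single riscv call site; the mnemonic strings are
# illustrative only and are not taken from CakeML's generated code.
INSN_BYTES = 4  # assumption: uncompressed 32-bit RISC-V encodings

# Current scheme: build the return address by hand, then do a plain jump.
current = [
    "auipc ra, <upper bits of offset to ret>",
    "addi  ra, ra, <lower bits of offset to ret>",
    "jal   zero, callee",  # jump without linking
]

# Proposed scheme: one jump-and-link; the return address is the implied
# label at the end of this instruction.
proposed = [
    "jal   ra, callee",
]


def size_in_bytes(seq):
    """Encoded size of an instruction sequence under the assumption above."""
    return len(seq) * INSN_BYTES


saved_per_call = size_in_bytes(current) - size_in_bytes(proposed)
```

Under these assumptions each call site shrinks by the size of the auipc/addi pair.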
On x64 we would like to use the CALL instruction, but this is much more work because CALL pushes the return address directly onto the stack; allowing that would need a potentially cross-cutting semantics change. POWER and SuperH take an intermediate approach, where a link register exists but is not a GPR.
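The difference between the two call styles can be sketched with a toy machine model. The `Machine` class and both functions are hypothetical and exist only for illustration; they are not part of CakeML or its semantics.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Toy machine state (hypothetical, for illustration only)."""
    pc: int = 0
    lr: int = 0        # link register; untouched by the x64-style call
    sp: int = 0x1000   # stack pointer; the stack grows downwards
    mem: dict = field(default_factory=dict)  # address -> stored word


def call_with_link_register(m, target, insn_len):
    """Jump-and-link style: the return address goes into a register."""
    m.lr = m.pc + insn_len
    m.pc = target


def call_x64_style(m, target, insn_len):
    """x64 CALL style: the return address is pushed onto the stack."""
    m.sp -= 8
    m.mem[m.sp] = m.pc + insn_len
    m.pc = target


a = Machine(pc=0x100)
call_with_link_register(a, 0x400, insn_len=4)

b = Machine(pc=0x100)
call_x64_style(b, 0x400, insn_len=5)  # CALL rel32 is 5 bytes long
```

In the first case only registers change, whereas in the second both the stack pointer and stack memory change as a side effect of the call, which is the part that the semantics would need to account for.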