Trampolines as a code size optimization
Several targets have a jump encoding whose range can plausibly reach a large number of functions, but not the entire binary (MIPS conditional branches: ±32K instructions; RISC-V unconditional and ARMv8 conditional: ±1M; ARMv7 T32: ±16M; ARMv7 A32: ±32M). Our lab_to_target already performs a global optimization to choose the jump encoding for each pair of instructions, so the common case works fine. However, for a few functions which are dynamically cold but have a large number of callers, such as the garbage collector and (depending on workload) bignum support code, it may be advantageous to generate trampolines: functions consisting of a single long jump, which many other functions can reach with a short jump. Each caller then pays only for the short encoding, and the cost of the long jump is paid once. (Most ARMv7 linkers have a similar feature for correctness, since compilers generate short jumps only and ELF object files do not carry enough information to rewrite short jumps into long jumps in place.)
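The trade-off can be sketched as a simple size estimate. This is only an illustration, not how lab_to_target works: the function names, the byte sizes of the short and long encodings, and the assumption that the trampoline itself is exactly one long jump are all hypothetical.

```python
def reachable(src, dst, reach):
    """True if a short jump at address src can reach dst (signed range of +/- reach bytes)."""
    return -reach <= dst - src < reach


def trampoline_saving(call_sites, target, tramp_addr, reach, short_size, long_size):
    """Estimate bytes saved by routing far call sites through one trampoline.

    All parameters are hypothetical inputs: call_sites and target are byte
    addresses, tramp_addr is where the trampoline would be placed, and
    short_size / long_size are the encoded sizes of the two jump forms.
    """
    # Call sites that already reach the target with a short jump gain nothing.
    far = [s for s in call_sites if not reachable(s, target, reach)]
    # Only far call sites that can reach the trampoline benefit from it.
    helped = [s for s in far if reachable(s, tramp_addr, reach)]
    if not helped:
        return 0  # no trampoline would be emitted
    # Each helped site shrinks from the long to the short encoding;
    # the trampoline itself costs one long jump.
    return len(helped) * (long_size - short_size) - long_size
```

Under these assumptions a trampoline only pays off once enough out-of-range callers share it. A negative result means the trampoline costs more than it saves, so the callers should keep their long jumps.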