Better encoding for immediate constants

Open sorear opened this issue 5 years ago • 0 comments

  • ag32: appears to be optimal.
  • arm7: We generate 12-byte sequences that load from the instruction stream for every constant outside the 12-bit mov immediate range. But since we depend on arm7 we can use the 16-bit immediates in movw and movt to load any constant in 2 instructions (8 bytes). About 90% of the 12-byte sequences in the bootstrap compiler (60k excluding basisprog_basis, 95k including) load constants < 0x10000 and could be handled by a single instruction, for an estimated saving of 400kb.
  • arm8: For anything outside the range of a simple movz/movn, we generate a sequence of 4 movk (16 bytes). But if we generate the first one as movz (movn), we can omit every following movk whose immediate is 0 (FFFF when starting from movn). This particularly happens for header words, which usually have nonzero bits only around bit positions 32 and 0. A basisprog-less arm8 compiler has 104k movk of 0, which could probably be omitted, saving 400kb.
  • mips, riscv: The most general (non-rvc) setup for a 64-bit constant requires 6 instructions and 24 bytes. The general optimization of an immediate as a sequence of shifts and adds is difficult to model, but the large majority of the 40k 64-bit constants in a basisless bootstrapped compiler are header words which can be constructed as "load immediate, shift left 32, add immediate", reducing to 12 bytes and saving 480kb.
  • x64: There is no easy way to load a register using an immediate shorter than 32 bits, but there are 26k zeroings in the basisless bootstrapped compiler, which could be replaced with xors, saving 75kb.
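
To make the arm7 and arm8 ideas concrete, here is a rough Python sketch of the instruction selection. It is not the CakeML encoder: `arm7_load_imm`, `arm8_load_imm` and `arm8_eval` are hypothetical names, instructions are modelled as plain tuples, and the arm7 half ignores the existing short mov immediate case.

```python
MASK64 = (1 << 64) - 1


def arm7_load_imm(value):
    """32-bit constant: movw sets the low 16 bits and clears the rest, movt sets the high 16."""
    value &= 0xFFFFFFFF
    insns = [("movw", value & 0xFFFF)]
    if value >> 16:  # constants < 0x10000 need only the movw
        insns.append(("movt", value >> 16))
    return insns


def arm8_load_imm(value):
    """64-bit constant: start with movz or movn, then movk only the chunks that differ."""
    value &= MASK64
    chunks = [(value >> shift) & 0xFFFF for shift in (0, 16, 32, 48)]
    # Start from all-ones (movn) only if that skips more chunks than all-zeros (movz).
    use_movn = chunks.count(0xFFFF) > chunks.count(0)
    fill = 0xFFFF if use_movn else 0
    insns = []
    for i, chunk in enumerate(chunks):
        if chunk == fill:
            continue  # already correct after the initial movz/movn
        shift = 16 * i
        if not insns:
            if use_movn:
                insns.append(("movn", chunk ^ 0xFFFF, shift))  # movn writes ~(imm << shift)
            else:
                insns.append(("movz", chunk, shift))
        else:
            insns.append(("movk", chunk, shift))
    if not insns:  # value is 0 or all ones
        insns.append(("movn" if use_movn else "movz", 0, 0))
    return insns


def arm8_eval(insns):
    """Tiny model of movz/movn/movk, used only to check the sequences above."""
    reg = 0
    for op, imm, shift in insns:
        if op == "movz":
            reg = imm << shift
        elif op == "movn":
            reg = ~(imm << shift) & MASK64
        else:  # movk keeps every bit outside the 16-bit field
            reg = (reg & ~(0xFFFF << shift)) | (imm << shift)
    return reg
```

For a header-like word such as `(3 << 32) | 0x12` the arm8 sketch emits a movz plus one movk (8 bytes) instead of four instructions.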

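The "load immediate, shift left 32, add immediate" shape for riscv can be sketched the same way. This is only an illustration under assumptions: it uses RISC-V's signed 12-bit addi range (mips immediates are 16 bits, so the bounds would differ), `riscv_header_imm` and `riscv_eval` are hypothetical names, and constants that do not fit simply return `None` so the caller can fall back to the general 6-instruction sequence.

```python
MASK64 = (1 << 64) - 1


def sext(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def riscv_header_imm(value):
    """Return [addi, slli 32, addi] if value == (hi << 32) + lo with 12-bit hi and lo."""
    value &= MASK64
    lo = sext(value, 12)           # what the final addi contributes
    rest = (value - lo) & MASK64   # what must be present before the final addi
    if rest & 0xFFFFFFFF:
        return None                # bits 12..31 are in use: not of this shape
    hi = sext(rest >> 32, 32)
    if not -2048 <= hi <= 2047:
        return None                # upper part does not fit a single addi
    return [("addi", hi), ("slli", 32), ("addi", lo)]


def riscv_eval(insns):
    """Tiny model of addi/slli on one 64-bit register that starts at zero."""
    reg = 0
    for op, arg in insns:
        if op == "addi":
            reg = (reg + arg) & MASK64
        else:  # slli
            reg = (reg << arg) & MASK64
    return reg
```

Each instruction is 4 bytes, so a constant accepted by this check takes 12 bytes rather than 24.
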
sorear • Sep 24 '20 20:09