The encoders are _big_

Open sorear opened this issue 5 years ago • 1 comments

Some numbers from a --target=riscv --exp_cut=99999 build (9a0180e sexpr-bootstrap):

1069060 cml_x64Prog_x64_enc_5306
 990976 cml_arm7Prog_arm7_enc_5296
 938348 cml_riscvProg_riscv_enc_5348
 784956 cml_arm8Prog_arm8_enc_5342
 373572 cml_mipsProg_mips_enc_5352
 291168 cml_ag32_enc_5324

Some of this is probably unavoidable with heap-allocated words but looking at the HOL input to the translator in riscvProgScript and x64ProgScript we seem to be forcing the encoders into a highly canonical form where no logic can be shared between different instructions; particularly notable are the 16 reg/imm (not)(lower, less, equal, test) cases of JumpCmp which result in 16 copies of the logic to encode a conditional jump displacement. Improving the pre-translate conversions here could make the bootstrap significantly faster with no effect whatsoever on compiling things in the logic.

Oct 15 '20 01:10 sorear

That latter issue you mention (16 copies of same code) seems like it might be a result of too much manual inlining on my part when I was setting up those translations.

It should certainly be possible to extract out that logic again into a separate function if it is indeed shared code.

Oct 15 '20 10:10 tanyongkiam