Missed optimisation: add immediate after lea
The example source that produces this code is at https://pushbx.org/ecm/test/20211114/test.c
It is taken from https://hg.pushbx.org/ecm/interc3/file/77616b6c4040/INTERPRE.C#l184 (The interpre source is in the Public Domain; the intercep component is under the Fair License.)
When I tried to reduce the example, the register allocation changed and the code in question was no longer produced. That's why the test.c file contains the entire interpret_file function.
The C source is compiled with:
ia16-elf-gcc -Wall -fpack-struct -mcmodel=small -Os test.c -masm=intel -S -o test.s
This is the relevant code:
swi_info rec;
swi_info_amis * amis;
...
if (format && (rec.intnum & 0xFF00) != 0) {
    switch (rec.intnum) {
    case 0x100:
        amis = (swi_info_amis *)&rec;
        fprintf(ofp, "Multiplexer replied:"
                " %04X:%04X -> \"%8.8s\" \"%8.8s\""
                " version %04X\n",
                amis->segment, amis->offset,
                amis->vendor, amis->product,
                amis->version);
This is the assembly generated from this source:
.L14:
cmp word ptr [bp+20], 0
je .L5
mov ax, word ptr [bp-402]
test ah, -1
je .L5
cmp ax, 256
jne .L4
push word ptr [bp-406]
lea ax, [-426+bp]
add ax, 8
push ax
lea ax, [-426+bp]
push ax
push word ptr [bp-410]
push word ptr [bp-408]
mov ax, offset .LC2
push ax
push word ptr [bp+12]
call fprintf
add sp, 14
jmp .L4
The part this feature request is about is:
lea ax, [-426+bp]
add ax, 8
push ax
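The add with an immediate could be folded into the displacement of the lea, giving a single lea ax, [-418+bp] (-426 + 8 = -418). That would drop the 3-byte add ax, 8 entirely.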
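As a sanity check on the proposed folding, here is a minimal C sketch that compares the two sequences for every possible 16-bit value of bp, with all arithmetic wrapping modulo 65536 as it does on the 8086. The function names are hypothetical and only for illustration; the constants -426 and 8 are the ones from the listing above. The sketch only compares the value left in ax. It does not model the flags, which add modifies and lea does not; in the code above the flags appear to be dead at that point, since only pushes and the call follow.

```c
#include <stdint.h>

/* Hypothetical helper: value of ax after
 * "lea ax, [-426+bp]" followed by "add ax, 8". */
uint16_t lea_then_add(uint16_t bp)
{
    uint16_t ax = (uint16_t)(bp - 426u);  /* lea ax, [-426+bp] */
    ax = (uint16_t)(ax + 8u);             /* add ax, 8 */
    return ax;
}

/* Hypothetical helper: value of ax after the proposed
 * single "lea ax, [-418+bp]". */
uint16_t folded_lea(uint16_t bp)
{
    return (uint16_t)(bp - 418u);
}

/* Returns 1 if both sequences agree for all 65536 values of bp. */
int folding_is_equivalent(void)
{
    uint32_t bp;

    for (bp = 0; bp <= 0xFFFFu; bp++) {
        if (lea_then_add((uint16_t)bp) != folded_lea((uint16_t)bp))
            return 0;
    }
    return 1;
}
```

Because both the lea displacement and the add wrap at 16 bits, the two forms should agree regardless of where the frame sits in the segment.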
Bonus optimisation:
test ah, -1
je .L5
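I believe test ah, ah would be one byte shorter: test ah, -1 encodes as F6 C4 FF (3 bytes), whereas test ah, ah encodes as 84 E4 (2 bytes). Both AND ah with a value that leaves it unchanged, so the flags come out the same.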
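To illustrate why the two test forms are interchangeable here, the following C sketch models the zero flag each one produces and compares it with the C condition (rec.intnum & 0xFF00) != 0 from the source. The function names are hypothetical; the sketch models only the zero flag, which is what the following je consumes.

```c
#include <stdint.h>

/* ZF as set by "test ah, -1": AND of ah with the 8-bit immediate 0xFF. */
int zf_test_imm(uint8_t ah)
{
    return (uint8_t)(ah & 0xFFu) == 0;
}

/* ZF as set by "test ah, ah": AND of ah with itself. */
int zf_test_self(uint8_t ah)
{
    return (uint8_t)(ah & ah) == 0;
}

/* The C condition from the source; ah is the high byte of intnum. */
int high_byte_nonzero(uint16_t intnum)
{
    return (intnum & 0xFF00u) != 0;
}

/* Returns 1 if both test forms agree with each other and with the
 * C condition for every 16-bit value of ax. */
int test_forms_agree(void)
{
    uint32_t ax;

    for (ax = 0; ax <= 0xFFFFu; ax++) {
        uint8_t ah = (uint8_t)(ax >> 8);

        if (zf_test_imm(ah) != zf_test_self(ah))
            return 0;
        /* "je .L5" is taken when ZF is set, i.e. when the condition is false. */
        if (zf_test_imm(ah) != !high_byte_nonzero((uint16_t)ax))
            return 0;
    }
    return 1;
}
```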
This is the header containing the packed structures used by the source: https://hg.pushbx.org/ecm/interc3/file/77616b6c4040/intercep.h
#include <stdint.h>
/* what we record in our memory block about each interrupt */
typedef struct __attribute__ ((__packed__)) {
    uint16_t bp, di, si, ds, es, dx;
    uint16_t cx, bx, ax;
    uint16_t ip;
    uint16_t cs;
    uint16_t flags;
    uint16_t intnum;
} swi_info;
typedef struct __attribute__ ((__packed__)) {
    uint8_t vendor[8];
    uint8_t product[8];
    uint16_t offset;
    uint16_t segment;
    uint16_t version;
    uint16_t reserved;
    uint16_t intnum;
} swi_info_amis;
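For reference, this is a small C sketch that ties the structure layout to the displacements in the assembly listing. It assumes, read off the listing rather than confirmed from the compiler, that rec starts at [bp-426]. REC_FRAME_DISP, frame_disp and layout_matches_listing are hypothetical names used only for this check. It relies on the GCC-style packed attribute, as the header does.

```c
#include <stddef.h>
#include <stdint.h>

/* Structures as declared in intercep.h. */
typedef struct __attribute__ ((__packed__)) {
    uint16_t bp, di, si, ds, es, dx;
    uint16_t cx, bx, ax;
    uint16_t ip;
    uint16_t cs;
    uint16_t flags;
    uint16_t intnum;
} swi_info;

typedef struct __attribute__ ((__packed__)) {
    uint8_t vendor[8];
    uint8_t product[8];
    uint16_t offset;
    uint16_t segment;
    uint16_t version;
    uint16_t reserved;
    uint16_t intnum;
} swi_info_amis;

/* Assumption read off the listing: rec starts at [bp-426]. */
#define REC_FRAME_DISP (-426)

/* Hypothetical helper: bp-relative displacement of a field of rec,
 * given the field's byte offset within the structure. */
int frame_disp(size_t field_offset)
{
    return REC_FRAME_DISP + (int)field_offset;
}

/* Returns 1 if the layout agrees with the displacements in the listing. */
int layout_matches_listing(void)
{
    return sizeof(swi_info) == 26
        && sizeof(swi_info_amis) == sizeof(swi_info)
        && offsetof(swi_info, intnum) == offsetof(swi_info_amis, intnum)
        && frame_disp(offsetof(swi_info, intnum)) == -402         /* mov ax, [bp-402] */
        && frame_disp(offsetof(swi_info_amis, version)) == -406   /* push [bp-406] */
        && frame_disp(offsetof(swi_info_amis, segment)) == -408   /* push [bp-408] */
        && frame_disp(offsetof(swi_info_amis, offset)) == -410    /* push [bp-410] */
        && frame_disp(offsetof(swi_info_amis, vendor)) == -426    /* lea ax, [-426+bp] */
        && frame_disp(offsetof(swi_info_amis, product)) == -418;  /* folded lea */
}
```

The product member sits at byte offset 8, which is where the add ax, 8 comes from and why the folded displacement would be -418.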