Stefanos Kornilios Mitsis Poiitidis

Results 336 comments of Stefanos Kornilios Mitsis Poiitidis

Implemented multiple entry points in `skmp/multiple-entry-points` (on top of optihacks-4). Perf results are mixed, with bytemark being slightly slower, FTL gaining a few FPS in complex points, and metro being...

The emfloat/fpemulation benchmark hits a pathological case of cmovcc/setcc. Fixing it tripled perf.

Added block sorting on the frontend to avoid out-of-order jumps in the backend. FEX ``` NUMERIC SORT : 581.03 : 14.90 : 4.89 STRING SORT : 140.48 :...
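A minimal sketch of the block sorting idea, assuming it means ordering decoded blocks by guest entry address before handing them to the backend, so that a jump to the next block in address order becomes a fallthrough. The `Block` type and the `sort_blocks` and `count_fallthroughs` helpers are hypothetical names for illustration and are not taken from the FEX source.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    # Hypothetical stand-in for a decoded guest block.
    entry: int                    # guest address where the block starts
    target: Optional[int] = None  # entry address of the block it jumps to, if any


def sort_blocks(blocks: List[Block]) -> List[Block]:
    """Return the blocks ordered by guest entry address."""
    return sorted(blocks, key=lambda b: b.entry)


def count_fallthroughs(blocks: List[Block]) -> int:
    """Count blocks whose jump target is the block laid out right after them."""
    count = 0
    for current, following in zip(blocks, blocks[1:]):
        if current.target == following.entry:
            count += 1
    return count


# Blocks in the order a decoder might discover them (assumed example data).
discovered = [
    Block(entry=0x1000, target=0x1010),
    Block(entry=0x1020),
    Block(entry=0x1010, target=0x1020),
]
ordered = sort_blocks(discovered)
```

In this toy input, no jump targets the adjacent block in discovery order, while after sorting both jumps do. Whether the real frontend sorts on the same key is an assumption.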

# Merges from `skmp/optihacks-4`
# Merged
## Done in IR improvements (#484)
- [x] 275c390 IR: Fix writer to handle more RA classes
## Done in profile improvements (#485)
-...

With first SRA impl (`skmp/optihacks-5`) ``` --------------------:------------------:-------------:------------ NUMERIC SORT : 793.81 : 20.36 : 6.69 STRING SORT : 168.68 : 75.37 : 11.67 BITFIELD : 4.6885e+08 : 80.42 : 16.80...

SRA + some mov elim + properly cooled laptop ``` --------------------:------------------:-------------:------------ NUMERIC SORT : 1107.1 : 28.39 : 9.32 STRING SORT : 191.97 : 85.78 : 13.28 BITFIELD : 5.8174e+08...

SRA + full width mov elim + cooled laptop ``` --------------------:------------------:-------------:------------ NUMERIC SORT : 1205.9 : 30.93 : 10.16 STRING SORT : 208.36 : 93.10 : 14.41 BITFIELD : 5.8431e+08...

libpng decode mainloop translation example -- codegen is starting to look quite optimal in some cases ``` str x4, [x24, #48] mov x4, #0x3f51 // #16209 ldr x5, [x28, #88]...

With experimental SRA16 (uses 6 caller-saved regs) ``` --------------------:------------------:-------------:------------ NUMERIC SORT : 1342.4 : 34.43 : 11.31 STRING SORT : 205.16 : 91.67 : 14.19 BITFIELD : 5.8536e+08 :...

With SRA16+16, some frontend improvements ``` --------------------:------------------:-------------:------------ NUMERIC SORT : 1387.8 : 35.59 : 11.69 STRING SORT : 264.09 : 118.00 : 18.26 BITFIELD : 5.8385e+08 : 100.15 : 20.92...
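To put the NUMERIC SORT figures quoted in these comments side by side, here is a small sketch that computes each run's score relative to the block sorting run. It assumes the first numeric column is a raw score where higher is better. The runs were not all taken under the same conditions (some note a cooled laptop), so the ratios are only a rough guide. The `speedups` helper is a hypothetical name.

```python
from typing import List, Tuple

# NUMERIC SORT first-column values as quoted in the comments above.
numeric_sort: List[Tuple[str, float]] = [
    ("block sorting", 581.03),
    ("first SRA impl", 793.81),
    ("SRA + some mov elim", 1107.1),
    ("SRA + full width mov elim", 1205.9),
    ("experimental SRA16", 1342.4),
    ("SRA16+16, frontend improvements", 1387.8),
]


def speedups(results: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Return each score divided by the first entry's score."""
    baseline = results[0][1]
    return [(label, score / baseline) for label, score in results]


for label, ratio in speedups(numeric_sort):
    print(f"{label:<34} {ratio:.2f}x")
```

Under these assumptions, the last run comes out at roughly 2.4 times the block sorting run on this one test.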