gigatron-rom
gigatron-rom copied to clipboard
lcc: overflow of relational operators
The result of SUBW doesn't indicate if an overflow has occurred. Consequently, vCPU offers only signed comparisons against 0 for branching decisions. With that, the relational operators a<b, a<=b, a a>=b a>b only work when the difference between a and b fits in 15 bits. Ultimately this idiosyncrasy stems from the lack of status register in the 8-bit hardware architecture, and the underlying design idea that software can always compensate for missing hardware features...
TinyBASIC_v2.gt1 uses the following sequence to get correct comparisons over the full range:
0461 ee 02 LDLW $02 |..|
0463 fc 3a XORW $3a |.:|
0465 35 53 6a BGE $046c |5Sj|
0468 ee 02 LDLW $02 |..|
046a 90 6e BRA $0470 |.n|
046c ee 02 LDLW $02 |..|
046e b8 3a SUBW $3a |.:|
0470 35 56 73 BLE $0475 |5Vs|
So prior to the standard subtract, it first checks if the operand highest bits are equal. If equal, it perform the normal SUBW for comparison. If not equal, the first operand is loaded for BLE/BLT (or the second operand in the case of BGE/BGT). This sequence costs 8 vCPU instructions or 18 bytes here, compared to 3/7 for the naive sequence. This is the reason that Tiny BASIC uses this idiom selectively.
For a compliant C compiler we need something similar. For example, now 0xffffu compares as smaller than 0. And we also can't really say that ints are just 15-bits wide instead.
My plan:
- First bite the bullet and implement it correctly for < <= > and >=. Try hard if we can get away with at most one new helper function in rt.py ("@prepcmp"?). For example, implement only in one order and reverse the order of operands if necessary.
- Provide macros/instrinsics/builtins to give access to the the naive relational operators when so desired by the programmer. Something like LT(a,b), GT(a,b), LE(a,b) and GE(a,b).
- Optimise away the prelude when comparing against a zero constant.
- Think deep about further optimisation options. For example, try to deduce the high bit of both sides by static analysis...?
- Consider some trivial rewrites. Eg. comparisons of an unsigned integer against 0 can be always be replaced by one of true/false/NE/EQ.
Re 4, perhaps you could add a compiler built-in that users could mark variables they don't expect full 16-bit comparisons on? That could then aid with the static analysis.
[Edit MvK: removed quoted e-mail]
Yes, this concept needs to evolve, hence the issue to track the thought process
My idea of prioritising things in LCC are:
- Correctness
- Usability (e.g. error messages, 64K support, essential library functions, ...)
- Optimisations ...
- Floating point :-)
This has the potential to make a lot of code very inefficient, and I like to know the impact. For example:
if (putc(…) < 0)… can become nasty, whereas
if (putc(…) == EOF) … has potentially an efficient translation (using ADDI 1 + BNE)
Perhaps we can squeeze in another new vCPU instruction that helps for these: "CMPW $DD".
ROM v5 will have CMPHS
and CMPHU
instructions that are designed to solve this. See also https://github.com/kervinck/gigatron-rom/commit/f584a2aa986949fcb42cb4606005d74e1ebc395a
Suggesting to close this issue since glcc does this correctly already (on both roms v4 --with code-- and v5a+ --with cmphi/cmphs).